A local linear kernel estimator of the regression function
$\mathbf{x} \mapsto g(\mathbf{x}) := E[Y_{\mathbf{i}} \mid \mathbf{X}_{\mathbf{i}} = \mathbf{x}]$, $\mathbf{x} \in \mathbb{R}^d$, of a stationary
$(d+1)$-dimensional spatial process $\{(Y_{\mathbf{i}}, \mathbf{X}_{\mathbf{i}}),\ \mathbf{i} \in \mathbb{Z}^N\}$ observed over a
rectangular domain of the form
$\mathcal{I}_{\mathbf{n}} := \{\mathbf{i} = (i_1, \ldots, i_N) \in \mathbb{Z}^N \mid 1 \le i_k \le n_k,\ k = 1, \ldots, N\}$,
$\mathbf{n} = (n_1, \ldots, n_N) \in \mathbb{Z}^N$, is proposed and investigated. Under mild
regularity assumptions, asymptotic normality of the estimators of
$g(\mathbf{x})$ and its derivatives is established. Appropriate choices of the
bandwidths are proposed. The spatial process is assumed to satisfy
some very general mixing conditions, generalizing classical time-series
strong mixing concepts. The size of the rectangular domain $\mathcal{I}_{\mathbf{n}}$ is
allowed to tend to infinity at different rates depending on the direction
in $\mathbb{Z}^N$.

1. Introduction. Spatial data arise in a variety of fields, including econo- 
metrics, epidemiology, environmental science, image analysis, oceanography 
and many others. The statistical treatment of such data is the subject of an 
abundant literature, which cannot be reviewed here; for background read- 
ing, we refer the reader to the monographs by Anselin and Florax (1995), 
Cressie (1991), Guyon (1995), Possolo (1991) or Ripley (1981). 

Let $\mathbb{Z}^N$, $N \ge 1$, denote the integer lattice points in the $N$-dimensional
Euclidean space. A point $\mathbf{i} = (i_1, \ldots, i_N)$ in $\mathbb{Z}^N$ will be referred to as a site.
Spatial data are modeled as finite realizations of vector stochastic processes
indexed by $\mathbf{i} \in \mathbb{Z}^N$: random fields. In this paper, we will consider strictly
stationary $(d+1)$-dimensional random fields, of the form

(1.1) $\{(Y_{\mathbf{i}}, \mathbf{X}_{\mathbf{i}});\ \mathbf{i} \in \mathbb{Z}^N\}$,

where $Y_{\mathbf{i}}$, with values in $\mathbb{R}$, and $\mathbf{X}_{\mathbf{i}}$, with values in $\mathbb{R}^d$, are defined over some
probability space $(\Omega, \mathcal{F}, P)$.

A crucial problem for a number of applications is the problem of spatial
regression, where the influence of a vector $\mathbf{X}_{\mathbf{i}}$ of covariates on some response
variable $Y_{\mathbf{i}}$ is to be studied in a context of complex spatial dependence. More
specifically, assuming that $Y_{\mathbf{i}}$ has finite expectation, the quantity under study
in such problems is the spatial regression function

$g : \mathbf{x} \mapsto g(\mathbf{x}) := E[Y_{\mathbf{i}} \mid \mathbf{X}_{\mathbf{i}} = \mathbf{x}]$.

The spatial dependence structure in this context plays the role of a nuisance, 
and remains unspecified. Although g of course is only defined up to a P- 
null set of values of x (being a class of P-a.s. mutually equal functions 
rather than a function), we will treat it, for the sake of simplicity, as a 
well-defined real-valued x-measurable function, which has no implication 
for the probabilistic statements of this paper. In the particular case under
which $\mathbf{X}_{\mathbf{i}}$ itself is measurable with respect to a subset of $Y_{\mathbf{j}}$'s, with $\mathbf{j}$ ranging
over some neighborhood of $\mathbf{i}$, $g$ is called a spatial autoregression function.
Such spatial autoregression models were considered as early as 1954, in the 
particular case of a linear autoregression function g, by Whittle (1954, 1963); 
see Besag (1974) for further developments in this context. 

In this paper, we are concerned with estimating the spatial regression
(autoregression) function $g : \mathbf{x} \mapsto g(\mathbf{x})$; contrary to Whittle (1954), we adopt
a nonparametric point of view, avoiding any parametric specification of the
possibly extremely complex spatial dependence structure of the data.

For $N = 1$, this problem reduces to the classical problem of (auto)regression
for serially dependent observations, which has received extensive attention in 
the literature; see, for instance, Roussas (1969, 1988), Masry (1983, 1986), 
Robinson (1983, 1987), Ioannides and Roussas (1987), Masry and Györfi
(1987), Yakowitz (1987), Boente and Fraiman (1988), Bosq (1989), Györfi,
Härdle, Sarda and Vieu (1989), Tran (1989), Masry and Tjøstheim (1995),
Hallin and Tran (1996), Lu and Cheng (1997), Lu (2001) and Wu and
Mielniczuk (2002), to quote only a few. Quite surprisingly, despite its importance
for applications, the spatial version ($N > 1$) of the same problem remains
essentially unexplored. Several recent papers [e.g., Tran (1990), Tran and
Yakowitz (1993), Carbon, Hallin and Tran (1996), Hallin, Lu and Tran
(2001, 2004), Biau (2003) and Biau and Cadre (2004)] deal with the related
problem of estimating the density $f$ of a random field of the form
$\{\mathbf{X}_{\mathbf{i}};\ \mathbf{i} \in \mathbb{Z}^N\}$, or the prediction problem but, to the best of our knowledge,
the only results available on the estimation of spatial regression functions
are those by Lu and Chen (2002, 2004), who investigate the properties of a
Nadaraya-Watson kernel estimator for $g$.

Though the Nadaraya-Watson method is central in most nonparametric
regression methods in the traditional serial case ($N = 1$), it has been
well documented [see, e.g., Fan and Gijbels (1996)] that this approach
suffers from several severe drawbacks, such as poor boundary performance,
excessive bias and low efficiency, and that the local polynomial fitting
methods developed by Stone (1977) and Cleveland (1979) are generally
preferable. Local polynomial fitting, and particularly its special case, local linear
fitting, has become increasingly popular in light of the work
by Cleveland and Loader (1996), Fan (1992), Fan and Gijbels (1992, 1995),
Hastie and Loader (1993), Ruppert and Wand (1994) and several others.
For $N = 1$, Masry and Fan (1997) have studied the asymptotics of local
polynomial fitting for regression under general mixing conditions. In this
paper, we extend this approach to the context of spatial regression ($N > 1$)
by defining an estimator of $g$ based on local linear fitting and establishing
its asymptotic properties.

Extending classical or time-series asymptotics ($N = 1$) to spatial
asymptotics ($N > 1$), however, is far from trivial. Due to the absence of any
canonical ordering in the space, there is no obvious definition of tail sigma-fields.
As a consequence, such a basic concept as ergodicity is anything but well
defined in the spatial context. Little seems to exist about this in the
literature, where only central limit results are well documented; see, for
instance, Bolthausen (1982) or Nakhapetyan (1980). Even the simple idea of
a sample size going to infinity (the sample here is observed over a rectangular
domain of the form $\mathcal{I}_{\mathbf{n}} := \{\mathbf{i} = (i_1, \ldots, i_N) \in \mathbb{Z}^N \mid 1 \le i_k \le n_k,\ k = 1, \ldots, N\}$, for
$\mathbf{n} = (n_1, \ldots, n_N) \in \mathbb{Z}^N$ with strictly positive coordinates $n_1, \ldots, n_N$) or the
concept of spatial mixing has to be clarified in this setting. The assumptions
we are making, (A4), (A4') and (A4"), are an attempt to provide reasonable
and flexible generalizations of traditional time-series concepts.

Assuming that $\mathbf{x} \mapsto g(\mathbf{x})$ is differentiable at $\mathbf{x}$, with gradient $\mathbf{x} \mapsto g'(\mathbf{x})$,
the main idea in local linear regression consists in approximating $g$ in the
neighborhood of $\mathbf{x}$ as

$g(\mathbf{z}) \approx g(\mathbf{x}) + (g'(\mathbf{x}))^{\tau}(\mathbf{z} - \mathbf{x})$,

and estimating $(g(\mathbf{x}), g'(\mathbf{x}))$ instead of simply running a classical
nonparametric (e.g., kernel-based) estimation method for $g$ itself. In order to do this,
we propose a weighted least squares estimator $(g_{\mathbf{n}}(\mathbf{x}), g'_{\mathbf{n}}(\mathbf{x}))$, and study its
asymptotic properties. Mainly, we establish its asymptotic normality under
various mixing conditions, as $\mathbf{n}$ goes to infinity in two distinct ways. Either
isotropic divergence ($\mathbf{n} \Rightarrow \infty$) can be considered; under this case,
observations are made over a rectangular domain $\mathcal{I}_{\mathbf{n}}$ of $\mathbb{Z}^N$ which expands at the
same rate in all directions; see Theorems 3.1, 3.2 and 3.5. Or, due to the
specific nature of the practical problem under study, the rates of expansion
of $\mathcal{I}_{\mathbf{n}}$ cannot be the same along all directions, and only a less restrictive
assumption of possibly nonisotropic divergence ($\mathbf{n} \to \infty$) can be made; see
Theorems 3.3 and 3.4.

The paper is organized as follows. In Section 2.1 we provide the notation
and main assumptions. Section 2.2 introduces the main ideas underlying 
local linear regression in the context of random fields and sketches the main 
steps of the proofs to be developed in the sequel. Section 2.3 is devoted to 
some preliminary results. Section 3 is the main section of the paper, where 
asymptotic normality is proved under the various types of asymptotics and 
various mixing assumptions. Section 4 provides some numerical illustrations. 
Proofs and technical lemmas are concentrated in Section 5. 

2. Local linear estimation of spatial regression. 

2.1. Notation and main assumptions. For the sake of convenience, we
summarize here the main assumptions we are making on the random field
(1.1) and the kernel $K$ to be used in the estimation method. Assumptions
(A1)-(A4) are related to the random field itself.

(A1) The random field (1.1) is strictly stationary. For all distinct $\mathbf{i}$ and $\mathbf{j}$
in $\mathbb{Z}^N$, the vectors $\mathbf{X}_{\mathbf{i}}$ and $\mathbf{X}_{\mathbf{j}}$ admit a joint density $f_{\mathbf{i},\mathbf{j}}$; moreover,
$|f_{\mathbf{i},\mathbf{j}}(\mathbf{x}', \mathbf{x}'') - f(\mathbf{x}') f(\mathbf{x}'')| \le C$ for all $\mathbf{i}, \mathbf{j} \in \mathbb{Z}^N$, all $\mathbf{x}', \mathbf{x}'' \in \mathbb{R}^d$, where
$C > 0$ is some constant, and $f$ denotes the marginal density of $\mathbf{X}_{\mathbf{i}}$.

(A2) The random variable $Y_{\mathbf{i}}$ has finite absolute moment of order $(2 + \delta)$;
that is, $E[|Y_{\mathbf{i}}|^{2+\delta}] < \infty$ for some $\delta > 0$.

(A3) The spatial regression function $g$ is twice differentiable. Denoting by
$g'(\mathbf{x})$ and $g''(\mathbf{x})$ its gradient and the matrix of its second derivatives
(at $\mathbf{x}$), respectively, $g''(\mathbf{x})$ is continuous at all $\mathbf{x}$.

Assumption (A1) is standard in this context; it has been used, for instance,
by Masry (1986) in the serial case $N = 1$, and by Tran (1990) in the spatial
context ($N > 1$). If the random field $\mathbf{X}_{\mathbf{i}}$ consists of independent observations,
then $|f_{\mathbf{i},\mathbf{j}}(\mathbf{x}', \mathbf{x}'') - f(\mathbf{x}') f(\mathbf{x}'')|$ vanishes as soon as $\mathbf{i}$ and $\mathbf{j}$ are distinct. Thus
(A1) also allows for unbounded densities.

Assumption (A4) is an assumption of spatial mixing taking two distinct
forms [either (A4) and (A4') or (A4) and (A4")]. For any collection of sites
$S \subset \mathbb{Z}^N$, denote by $\mathcal{B}(S)$ the Borel $\sigma$-field generated by $\{(Y_{\mathbf{i}}, \mathbf{X}_{\mathbf{i}}) \mid \mathbf{i} \in S\}$;
for each couple $S', S''$, let $d(S', S'') := \min\{\|\mathbf{i}' - \mathbf{i}''\| \mid \mathbf{i}' \in S', \mathbf{i}'' \in S''\}$ be the
distance between $S'$ and $S''$, where $\|\mathbf{i}\| := (i_1^2 + \cdots + i_N^2)^{1/2}$ stands for the
Euclidean norm. Finally, write $\mathrm{Card}(S)$ for the cardinality of $S$.
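
As an illustration of the quantities just defined, the following minimal sketch computes the distance $d(S', S'')$ between two finite collections of sites and evaluates a bound of the product form $\psi(\mathrm{Card}(S'), \mathrm{Card}(S''))\,\varphi(d(S', S''))$ used in the mixing assumption stated next. The function names `site_distance` and `mixing_bound`, the two site collections and the polynomial choice of $\varphi$ are hypothetical and only serve the example.

```python
import itertools
import math


def site_distance(S1, S2):
    """d(S', S''): smallest Euclidean distance between a site of S1 and a site of S2."""
    return min(math.dist(i, j) for i, j in itertools.product(S1, S2))


def mixing_bound(S1, S2, psi, phi):
    """Product bound psi(Card(S'), Card(S'')) * phi(d(S', S''))."""
    return psi(len(S1), len(S2)) * phi(site_distance(S1, S2))


# Hypothetical example on Z^2: psi = min (the form allowed under (A4')),
# phi(t) = t^(-2) (an assumed polynomial decay, chosen only for illustration).
S1 = [(0, 0), (0, 1)]
S2 = [(3, 4), (5, 5)]
bound = mixing_bound(S1, S2, psi=min, phi=lambda t: t ** -2.0)
print(site_distance(S1, S2), bound)
```

The brute-force minimum over all pairs is quadratic in the number of sites, which is adequate for small illustrative collections only.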

(A4) There exist a function $\varphi$ such that $\varphi(t) \downarrow 0$ as $t \to \infty$, and a function
$\psi : \mathbb{N}^2 \to \mathbb{R}^+$ symmetric and decreasing in each of its two arguments,
such that the random field (1.1) is mixing, with spatial mixing
coefficients $\alpha$ satisfying

(2.1) $\alpha(\mathcal{B}(S'), \mathcal{B}(S'')) := \sup\{|P(AB) - P(A)P(B)|,\ A \in \mathcal{B}(S'),\ B \in \mathcal{B}(S'')\} \le \psi(\mathrm{Card}(S'), \mathrm{Card}(S''))\,\varphi(d(S', S''))$,

for any $S', S'' \subset \mathbb{Z}^N$. The function $\varphi$, moreover, is such that

$\lim_{m \to \infty} m^{a} \sum_{j=m}^{\infty} j^{N-1} \{\varphi(j)\}^{\delta/(2+\delta)} = 0$

for some constant $a > (4 + \delta)N/(2 + \delta)$.

The assumptions we are making on the function $\psi$ are either

(A4') $\psi(n', n'') \le \min(n', n'')$

or

(A4") $\psi(n', n'') \le C(n' + n'' + 1)^{\kappa}$ for some $C > 0$ and $\kappa > 1$.

In case (2.1) holds with $\psi \equiv 1$, the random field $\{(Y_{\mathbf{i}}, \mathbf{X}_{\mathbf{i}})\}$ is called strongly
mixing.

In the serial case ($N = 1$), many stochastic processes and time series are
known to be strongly mixing. Withers (1981) has obtained various
conditions for linear processes to be strongly mixing. Under certain weak
assumptions, autoregressive and more general nonlinear time-series models are
strongly mixing with exponential mixing rates; see Pham and Tran (1985),
Pham (1986), Tjøstheim (1990) and Lu (1998). Guyon (1987) has shown
that the results of Withers under certain conditions extend to linear
random fields, of the form $X_{\mathbf{n}} = \sum_{\mathbf{j} \in \mathbb{Z}^N} g_{\mathbf{j}} Z_{\mathbf{n} - \mathbf{j}}$, where the $Z_{\mathbf{j}}$'s are independent
random variables. Assumptions (A4') and (A4") are the same as the mixing
conditions used by Neaderhouser (1980) and Takahata (1983), respectively,
and are weaker than the uniform strong mixing condition considered by
Nakhapetyan (1980). They are satisfied by many spatial models, as shown
by Neaderhouser (1980), Rosenblatt (1985) and Guyon (1987).

Throughout, we assume that the random field (1.1) is observed over a
rectangular region of the form $\mathcal{I}_{\mathbf{n}} := \{\mathbf{i} = (i_1, \ldots, i_N) \in \mathbb{Z}^N \mid 1 \le i_k \le n_k,\ k = 1, \ldots, N\}$,
for $\mathbf{n} = (n_1, \ldots, n_N) \in \mathbb{Z}^N$ with strictly positive coordinates $n_1, \ldots, n_N$.
The total sample size is thus $\hat{\mathbf{n}} := \prod_{k=1}^{N} n_k$. We write $\mathbf{n} \to \infty$ as soon as
$\min_{1 \le k \le N}\{n_k\} \to \infty$. The rate at which the rectangular region expands thus
can depend on the direction in $\mathbb{Z}^N$. In some problems, however, the
assumption that this rate is the same in all directions is natural: we use the
notation $\mathbf{n} \Rightarrow \infty$ if $\mathbf{n} \to \infty$ and moreover $|n_j/n_k| < C$ for some $0 < C < \infty$,
$1 \le j, k \le N$. In this latter case, $\mathbf{n}$ tends to infinity in an isotropic way. The
nonisotropic case $\mathbf{n} \to \infty$ is less restrictive. For more information on the
nonisotropic case, we refer to Bradley and Tran (1999) and Lu and Chen
(2002).
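
The difference between the two modes of divergence can be made concrete with a small sketch. It enumerates the sites of a rectangular observation region, computes the total sample size as the product of the side lengths, and compares the largest ratio of side lengths along an isotropic and a nonisotropic sequence of regions. The helper names and the two sequences of side lengths are illustrative assumptions, not part of the paper.

```python
import itertools
import math


def sites(n):
    """All sites i = (i_1, ..., i_N) with 1 <= i_k <= n_k."""
    return list(itertools.product(*(range(1, nk + 1) for nk in n)))


def total_sample_size(n):
    """Product of the side lengths n_1 * ... * n_N."""
    return math.prod(n)


def side_ratio(n):
    """Largest ratio n_j / n_k; it stays bounded under isotropic divergence."""
    return max(n) / min(n)


# Hypothetical sequences of regions in Z^2.
isotropic = [(m, 2 * m) for m in (10, 100, 1000)]     # ratio stays equal to 2
nonisotropic = [(m, m * m) for m in (10, 100, 1000)]  # ratio grows without bound

print([side_ratio(n) for n in isotropic])
print([side_ratio(n) for n in nonisotropic])
```

In both sequences every side length tends to infinity, so both diverge in the weak sense; only the first one keeps all ratios of side lengths bounded.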

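Assumption (A5) deals with the kernel function $K : \mathbb{R}^d \to \mathbb{R}$ to be used in
the estimation method. For any $\mathbf{c} := (c_0, \mathbf{c}_1^{\tau})^{\tau} \in \mathbb{R}^{d+1}$, define

(2.2) $K_{\mathbf{c}}(\mathbf{u}) := (c_0 + \mathbf{c}_1^{\tau}\mathbf{u}) K(\mathbf{u})$.

(A5)(i) For any $\mathbf{c} \in \mathbb{R}^{d+1}$, $|K_{\mathbf{c}}(\mathbf{u})|$ is uniformly bounded by some constant,
and is integrable: $\int_{\mathbb{R}^{d+1}} |K_{\mathbf{c}}(\mathbf{x})|\,d\mathbf{x} < \infty$.
(ii) For any $\mathbf{c} \in \mathbb{R}^{d+1}$, $|K_{\mathbf{c}}|$ has an integrable second-order radial
majorant, that is, $Q_{\mathbf{c}}^{K}(\mathbf{x}) := \sup_{\|\mathbf{y}\| \ge \|\mathbf{x}\|} [\|\mathbf{y}\|^{2} K_{\mathbf{c}}(\mathbf{y})]$ is integrable.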

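Finally, for convenient reference, we list here some conditions on the
asymptotic behavior, as $\mathbf{n} \to \infty$, of the bandwidth $b_{\mathbf{n}}$ that will be used
in the sequel.

(B1) The bandwidth $b_{\mathbf{n}}$ tends to zero in such a way that $\hat{\mathbf{n}} b_{\mathbf{n}}^{d} \to \infty$ as
$\mathbf{n} \to \infty$.

(B2) There exist two sequences of positive integer vectors, $\mathbf{p} = \mathbf{p}_{\mathbf{n}} := (p_1, \ldots, p_N) \in \mathbb{Z}^N$
and $\mathbf{q} = \mathbf{q}_{\mathbf{n}} := (q, \ldots, q) \in \mathbb{Z}^N$, with $q = q_{\mathbf{n}} \to \infty$ such that
$p = p_{\mathbf{n}} := \hat{\mathbf{p}} = o((\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2})$, $q/p_k \to 0$ and $n_k/p_k \to \infty$ for all $k = 1, \ldots, N$,
and $\hat{\mathbf{n}}\varphi(q) \to 0$.

(B2') Same as (B2), but the last condition is replaced by $(\hat{\mathbf{n}}^{\kappa+1}/p)\varphi(q) \to 0$,
where $\kappa$ is the constant appearing in (A4").

(B3) $b_{\mathbf{n}}$ tends to zero in such a manner that $q b_{\mathbf{n}}^{\delta d/[a(2+\delta)]} > 1$ and

(2.3) $b_{\mathbf{n}}^{-\delta d/(2+\delta)} \sum_{t=q}^{\infty} t^{N-1} \{\varphi(t)\}^{\delta/(2+\delta)} \to 0$ as $\mathbf{n} \to \infty$.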
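
2.2. Local linear fitting. Local linear fitting consists in approximating,
in a neighborhood of $\mathbf{x}$, the unknown function $g$ by a linear function. Under
(A3), we have

$g(\mathbf{z}) \approx g(\mathbf{x}) + (g'(\mathbf{x}))^{\tau}(\mathbf{z} - \mathbf{x}) =: a_0 + \mathbf{a}_1^{\tau}(\mathbf{z} - \mathbf{x})$.

Locally, this suggests estimating $(a_0, \mathbf{a}_1^{\tau}) = (g(\mathbf{x}), (g'(\mathbf{x}))^{\tau})$, hence constructing
an estimator of $g$ from

(2.4) $\begin{pmatrix} g_{\mathbf{n}}(\mathbf{x}) \\ g'_{\mathbf{n}}(\mathbf{x}) \end{pmatrix} = \begin{pmatrix} \hat{a}_0 \\ \hat{\mathbf{a}}_1 \end{pmatrix} := \arg\min_{(a_0, \mathbf{a}_1) \in \mathbb{R}^{d+1}} \sum_{\mathbf{j} \in \mathcal{I}_{\mathbf{n}}} (Y_{\mathbf{j}} - a_0 - \mathbf{a}_1^{\tau}(\mathbf{X}_{\mathbf{j}} - \mathbf{x}))^{2} K\!\left(\frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_{\mathbf{n}}}\right)$,

where $b_{\mathbf{n}}$ is a sequence of bandwidths tending to zero at an appropriate rate
as $\mathbf{n}$ tends to infinity, and $K(\cdot)$ is a (bounded) kernel with values in $\mathbb{R}^{+}$.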

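In the classical serial case ($N = 1$; we write $i$ and $n$ instead of $\mathbf{i}$ and $\mathbf{n}$), the
solution of the minimization problem (2.4) is easily shown to be $(\mathbf{X}^{\tau}\mathbf{W}\mathbf{X})^{-1}\mathbf{X}^{\tau}\mathbf{W}\mathbf{Y}$,
where $\mathbf{X}$ is an $n \times (d + 1)$ matrix with $i$th row $(1, b_n^{-1}(\mathbf{X}_i - \mathbf{x})^{\tau})$,
$\mathbf{W} = b_n^{-d}\,\mathrm{diag}(K(\frac{\mathbf{X}_1 - \mathbf{x}}{b_n}), \ldots, K(\frac{\mathbf{X}_n - \mathbf{x}}{b_n}))$, and $\mathbf{Y} = (Y_1, \ldots, Y_n)^{\tau}$ [see, e.g., Fan and
Gijbels (1996)]. In the spatial case, things are not as simple, and we rather
write the solution to (2.4) as

$\begin{pmatrix} \hat{a}_0 \\ \hat{\mathbf{a}}_1 b_{\mathbf{n}} \end{pmatrix} = \mathbf{U}_{\mathbf{n}}^{-1}\mathbf{V}_{\mathbf{n}}$, where $\mathbf{V}_{\mathbf{n}} := \begin{pmatrix} v_{\mathbf{n}0} \\ \mathbf{v}_{\mathbf{n}1} \end{pmatrix}$ and $\mathbf{U}_{\mathbf{n}} := \begin{pmatrix} u_{\mathbf{n}00} & \mathbf{u}_{\mathbf{n}01} \\ \mathbf{u}_{\mathbf{n}10} & \mathbf{u}_{\mathbf{n}11} \end{pmatrix}$,

with [letting $(\frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_{\mathbf{n}}})_0 := 1$]

$(\mathbf{V}_{\mathbf{n}})_i := (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{-1} \sum_{\mathbf{j} \in \mathcal{I}_{\mathbf{n}}} Y_{\mathbf{j}} \left(\frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_{\mathbf{n}}}\right)_i K\!\left(\frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_{\mathbf{n}}}\right)$, $i = 0, \ldots, d$,

and

$(\mathbf{U}_{\mathbf{n}})_{i\ell} := (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{-1} \sum_{\mathbf{j} \in \mathcal{I}_{\mathbf{n}}} \left(\frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_{\mathbf{n}}}\right)_i \left(\frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_{\mathbf{n}}}\right)_{\ell} K\!\left(\frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_{\mathbf{n}}}\right)$, $i, \ell = 0, \ldots, d$.

It follows that

(2.5) $\mathbf{H}_{\mathbf{n}} := \begin{pmatrix} \hat{a}_0 - a_0 \\ (\hat{\mathbf{a}}_1 - \mathbf{a}_1) b_{\mathbf{n}} \end{pmatrix} = \begin{pmatrix} g_{\mathbf{n}}(\mathbf{x}) - g(\mathbf{x}) \\ (g'_{\mathbf{n}}(\mathbf{x}) - g'(\mathbf{x})) b_{\mathbf{n}} \end{pmatrix} = \mathbf{U}_{\mathbf{n}}^{-1}\left\{\mathbf{V}_{\mathbf{n}} - \mathbf{U}_{\mathbf{n}} \begin{pmatrix} a_0 \\ \mathbf{a}_1 b_{\mathbf{n}} \end{pmatrix}\right\} =: \mathbf{U}_{\mathbf{n}}^{-1}\mathbf{W}_{\mathbf{n}}$,

where

(2.6) $\mathbf{W}_{\mathbf{n}} := \begin{pmatrix} w_{\mathbf{n}0} \\ \mathbf{w}_{\mathbf{n}1} \end{pmatrix}$, $(\mathbf{W}_{\mathbf{n}})_i := (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{-1} \sum_{\mathbf{j} \in \mathcal{I}_{\mathbf{n}}} Z_{\mathbf{j}} \left(\frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_{\mathbf{n}}}\right)_i K\!\left(\frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_{\mathbf{n}}}\right)$, $i = 0, \ldots, d$,

and $Z_{\mathbf{j}} := Y_{\mathbf{j}} - a_0 - \mathbf{a}_1^{\tau}(\mathbf{X}_{\mathbf{j}} - \mathbf{x})$.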
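
The weighted least squares problem behind local linear fitting can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes a standard Gaussian product kernel, flattens the field over its sites (the least squares criterion is a plain sum over sites, so their ordering is irrelevant), and solves the normal equations for the intercept and the bandwidth-scaled slope. The names `gaussian_kernel` and `local_linear_fit` are hypothetical.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard Gaussian product kernel on R^d; u has shape (n_sites, d)."""
    d = u.shape[1]
    return np.exp(-0.5 * np.sum(u ** 2, axis=1)) / (2.0 * np.pi) ** (d / 2.0)


def local_linear_fit(X, Y, x, b, kernel=gaussian_kernel):
    """Return estimates of g(x) and of the gradient g'(x) by local linear fitting.

    Y: responses over the sites, any array shape (flattened internally).
    X: covariates, same leading shape as Y, with an optional trailing axis of size d.
    """
    Y = np.asarray(Y, dtype=float).ravel()
    X = np.asarray(X, dtype=float).reshape(Y.size, -1)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    d = X.shape[1]
    u = (X - x) / b                               # scaled covariates (X_j - x) / b
    w = kernel(u)                                 # kernel weights
    D = np.column_stack([np.ones(Y.size), u])     # rows (1, (X_j - x)^T / b)
    scale = Y.size * b ** d                       # normalization; cancels in the solve
    U_n = (D * w[:, None]).T @ D / scale
    V_n = (D * w[:, None]).T @ Y / scale
    a = np.linalg.solve(U_n, V_n)                 # (intercept, slope * b)
    return a[0], a[1:] / b


# Usage on exactly linear data, for which the local linear fit is exact.
X = np.linspace(-1.0, 1.0, 50)
Y = 1.0 + 2.0 * X
g_hat, grad_hat = local_linear_fit(X, Y, x=0.3, b=0.5)
print(g_hat, grad_hat)
```

Solving the small $(d+1) \times (d+1)$ system directly is adequate here; for nearly singular designs (very small bandwidths, few sites near the evaluation point) a least squares solver would be a safer choice.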

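The organization of the paper is as follows. If, under adequate conditions,
we are able to show that:

(C1) $(\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2}(\mathbf{W}_{\mathbf{n}} - E\mathbf{W}_{\mathbf{n}})$ is asymptotically normal,
(C2) $(\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2} E\mathbf{W}_{\mathbf{n}} \to 0$ and $\mathrm{Var}((\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2}\mathbf{W}_{\mathbf{n}}) \to \Sigma$, and
(C3) $\mathbf{U}_{\mathbf{n}} \xrightarrow{P} \mathbf{U}$,

then (2.5) and Slutsky's classical argument imply that, for all $\mathbf{x}$ (all
quantities involved indeed depend on $\mathbf{x}$),

$(\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2}\mathbf{H}_{\mathbf{n}} \xrightarrow{\mathcal{L}} \mathcal{N}(\mathbf{0}, \mathbf{U}^{-1}\Sigma(\mathbf{U}^{-1})^{\tau})$.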

This asymptotic normality result (with explicit values of $\Sigma$ and $\mathbf{U}$), under
various forms (depending on the mixing assumptions [(A4') or (A4")], the
choice of the bandwidth $b_{\mathbf{n}}$, the way $\mathbf{n}$ tends to infinity, etc.), is the main
contribution of this paper; see Theorems 3.1-3.5. Section 2.3 deals with
(C2) and (C3) under $\mathbf{n} \to \infty$ (hence also under the stronger assumption
that n =^> oo), and Sections 3.1 and 3.2 with (CI) under n oo and n oo, 
respectively. 
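
2.3. Preliminaries. Claim (C3) is easily established from the following
lemma, the proof of which is similar to that of Lemma 2.2, and is therefore
omitted.

Lemma 2.1. Assume that (A1), (A4) and (A5) hold, that $b_{\mathbf{n}}$ satisfies
assumption (B1) and that $n_k b_{\mathbf{n}}^{\delta d/[a(2+\delta)]} > 1$ as $\mathbf{n} \to \infty$. Then, for all $\mathbf{x}$,

$\mathbf{U}_{\mathbf{n}} \xrightarrow{P} \mathbf{U} := \begin{pmatrix} f(\mathbf{x}) \int K(\mathbf{u})\,d\mathbf{u} & f(\mathbf{x}) \int \mathbf{u}^{\tau} K(\mathbf{u})\,d\mathbf{u} \\ f(\mathbf{x}) \int \mathbf{u} K(\mathbf{u})\,d\mathbf{u} & f(\mathbf{x}) \int \mathbf{u}\mathbf{u}^{\tau} K(\mathbf{u})\,d\mathbf{u} \end{pmatrix}$

as $\mathbf{n} \to \infty$.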


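The remainder of this section is devoted to claim (C2). The usual
Cramér-Wold device will be adopted. For all $\mathbf{c} := (c_0, \mathbf{c}_1^{\tau})^{\tau} \in \mathbb{R}^{d+1}$, let

$A_{\mathbf{n}} := (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2}\,\mathbf{c}^{\tau}\mathbf{W}_{\mathbf{n}} = (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{-1/2} \sum_{\mathbf{j} \in \mathcal{I}_{\mathbf{n}}} Z_{\mathbf{j}} K_{\mathbf{c}}\!\left(\frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_{\mathbf{n}}}\right)$,

with $K_{\mathbf{c}}(\mathbf{u})$ defined in (2.2). The following lemma provides the asymptotic
variance of $A_{\mathbf{n}}$ for all $\mathbf{c}$, hence that of $(\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2}\mathbf{W}_{\mathbf{n}}$.

Lemma 2.2. Assume that (A1), (A2), (A4) and (A5) hold, that $b_{\mathbf{n}}$
satisfies assumption (B1) and that $n_k b_{\mathbf{n}}^{\delta d/[a(2+\delta)]} > 1$ for all $k = 1, \ldots, N$, as
$\mathbf{n} \to \infty$. Then

(2.7) $\lim_{\mathbf{n} \to \infty} \mathrm{Var}[A_{\mathbf{n}}] = \mathrm{Var}(Y_{\mathbf{j}} \mid \mathbf{X}_{\mathbf{j}} = \mathbf{x})\, f(\mathbf{x}) \int_{\mathbb{R}^d} K_{\mathbf{c}}^{2}(\mathbf{u})\,d\mathbf{u} = \mathbf{c}^{\tau}\Sigma\mathbf{c}$,

where

$\Sigma := \mathrm{Var}(Y_{\mathbf{j}} \mid \mathbf{X}_{\mathbf{j}} = \mathbf{x})\, f(\mathbf{x}) \begin{pmatrix} \int K^{2}(\mathbf{u})\,d\mathbf{u} & \int \mathbf{u}^{\tau} K^{2}(\mathbf{u})\,d\mathbf{u} \\ \int \mathbf{u} K^{2}(\mathbf{u})\,d\mathbf{u} & \int \mathbf{u}\mathbf{u}^{\tau} K^{2}(\mathbf{u})\,d\mathbf{u} \end{pmatrix}$.

Hence $\lim_{\mathbf{n} \to \infty} \mathrm{Var}((\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2}\mathbf{W}_{\mathbf{n}}) = \Sigma$.

For the proof see Section 5.1.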

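Next we consider the asymptotic behavior of $E[A_{\mathbf{n}}]$.

Lemma 2.3. Under assumptions (A3) and (A5),

(2.8) $E[A_{\mathbf{n}}] = \sqrt{\hat{\mathbf{n}} b_{\mathbf{n}}^{d}}\, b_{\mathbf{n}}^{2}\, \tfrac{1}{2}\, \mathrm{tr}\!\left[g''(\mathbf{x}) f(\mathbf{x}) \int \mathbf{u}\mathbf{u}^{\tau} K_{\mathbf{c}}(\mathbf{u})\,d\mathbf{u}\right] + o\!\left(\sqrt{\hat{\mathbf{n}} b_{\mathbf{n}}^{d}}\, b_{\mathbf{n}}^{2}\right) = \sqrt{\hat{\mathbf{n}} b_{\mathbf{n}}^{d}}\, b_{\mathbf{n}}^{2}\,[c_0 B_0(\mathbf{x}) + \mathbf{c}_1^{\tau}\mathbf{B}_1(\mathbf{x})] + o\!\left(\sqrt{\hat{\mathbf{n}} b_{\mathbf{n}}^{d}}\, b_{\mathbf{n}}^{2}\right)$,

where

$B_0(\mathbf{x}) := \tfrac{1}{2} f(\mathbf{x}) \sum_{i=1}^{d} \sum_{j=1}^{d} g_{ij}(\mathbf{x}) \int u_i u_j K(\mathbf{u})\,d\mathbf{u}$,

$\mathbf{B}_1(\mathbf{x}) := \tfrac{1}{2} f(\mathbf{x}) \sum_{i=1}^{d} \sum_{j=1}^{d} g_{ij}(\mathbf{x}) \int u_i u_j \mathbf{u} K(\mathbf{u})\,d\mathbf{u}$,

$g_{ij}(\mathbf{x}) = \partial^{2} g(\mathbf{x})/\partial x_i \partial x_j$, $i, j = 1, \ldots, d$, and $\mathbf{u} := (u_1, \ldots, u_d)^{\tau} \in \mathbb{R}^d$.

For the proof see Section 5.2.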

3. Asymptotic normality.

3.1. Asymptotic normality under mixing assumption (A4'). The asymptotic
normality of our estimators relies in a crucial manner on the following
lemma [see (2.6) for the definition of $\mathbf{W}_{\mathbf{n}}(\mathbf{x})$].

Lemma 3.1. Suppose that assumptions (A1), (A2), (A4), (A4') and
(A5) hold, and that the bandwidth $b_{\mathbf{n}}$ satisfies conditions (B1)-(B3). Denote
by $\sigma^{2}$ the asymptotic variance (2.7). Then $(\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2}(\mathbf{c}^{\tau}[\mathbf{W}_{\mathbf{n}}(\mathbf{x}) - E\mathbf{W}_{\mathbf{n}}(\mathbf{x})]/\sigma)$
is asymptotically standard normal as $\mathbf{n} \to \infty$.

For the proof see Section 5.3.

We now turn to the main consistency and asymptotic normality results.
First, we consider the case where the sample size tends to $\infty$ in the manner
of Tran (1990), that is, $\mathbf{n} \Rightarrow \infty$.

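Theorem 3.1. Let assumptions (A1)-(A3), (A4') and (A5) hold, with
$\varphi(x) = O(x^{-\mu})$ for some $\mu > 2(3 + \delta)N/\delta$. Suppose that there exists a
sequence of positive integers $q = q_{\mathbf{n}} \to \infty$ such that $q_{\mathbf{n}} = o((\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2N})$ and
$\hat{\mathbf{n}} q^{-\mu} \to 0$ as $\mathbf{n} \Rightarrow \infty$, and that the bandwidth $b_{\mathbf{n}}$ tends to zero in such a
manner that

(3.1) $q b_{\mathbf{n}}^{\delta d/[a(2+\delta)]} > 1$

for some $(4 + \delta)N/(2 + \delta) < a < \mu\delta/(2 + \delta) - N$ as $\mathbf{n} \Rightarrow \infty$. Then,

(3.2) $(\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2}\left[\begin{pmatrix} g_{\mathbf{n}}(\mathbf{x}) - g(\mathbf{x}) \\ b_{\mathbf{n}}(g'_{\mathbf{n}}(\mathbf{x}) - g'(\mathbf{x})) \end{pmatrix} - \mathbf{U}^{-1}\begin{pmatrix} B_0(\mathbf{x}) \\ \mathbf{B}_1(\mathbf{x}) \end{pmatrix} b_{\mathbf{n}}^{2}\right] \xrightarrow{\mathcal{L}} \mathcal{N}(\mathbf{0}, \mathbf{U}^{-1}\Sigma(\mathbf{U}^{-1})^{\tau})$

as $\mathbf{n} \Rightarrow \infty$, where $\mathbf{U}$, $\Sigma$, $B_0(\mathbf{x})$ and $\mathbf{B}_1(\mathbf{x})$ are defined in Lemmas 2.1, 2.2
and 2.3, respectively. If, furthermore, the kernel $K(\cdot)$ is a symmetric density
function, then (3.2) can be reinforced into

$\begin{pmatrix} (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2}[g_{\mathbf{n}}(\mathbf{x}) - g(\mathbf{x}) - B_g(\mathbf{x}) b_{\mathbf{n}}^{2}] \\ (\hat{\mathbf{n}} b_{\mathbf{n}}^{d+2})^{1/2}[g'_{\mathbf{n}}(\mathbf{x}) - g'(\mathbf{x})] \end{pmatrix} \xrightarrow{\mathcal{L}} \mathcal{N}\!\left(\mathbf{0}, \begin{pmatrix} \sigma_0^{2}(\mathbf{x}) & \mathbf{0} \\ \mathbf{0} & \sigma_1^{2}(\mathbf{x}) \end{pmatrix}\right)$

[so that $g_{\mathbf{n}}(\mathbf{x})$ and $g'_{\mathbf{n}}(\mathbf{x})$ are asymptotically independent], where

$B_g(\mathbf{x}) := \tfrac{1}{2} \sum_{i=1}^{d} g_{ii}(\mathbf{x}) \int (\mathbf{u})_i^{2} K(\mathbf{u})\,d\mathbf{u}$, $\quad \sigma_0^{2}(\mathbf{x}) := \dfrac{\mathrm{Var}(Y_{\mathbf{j}} \mid \mathbf{X}_{\mathbf{j}} = \mathbf{x}) \int K^{2}(\mathbf{u})\,d\mathbf{u}}{f(\mathbf{x})}$

and

$\sigma_1^{2}(\mathbf{x}) := \dfrac{\mathrm{Var}(Y_{\mathbf{j}} \mid \mathbf{X}_{\mathbf{j}} = \mathbf{x})}{f(\mathbf{x})} \left[\int \mathbf{u}\mathbf{u}^{\tau} K(\mathbf{u})\,d\mathbf{u}\right]^{-1} \left[\int \mathbf{u}\mathbf{u}^{\tau} K^{2}(\mathbf{u})\,d\mathbf{u}\right] \left[\int \mathbf{u}\mathbf{u}^{\tau} K(\mathbf{u})\,d\mathbf{u}\right]^{-1}$.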

The asymptotic normality results in Theorem 3.1 are stated for $g_{\mathbf{n}}(\mathbf{x})$
and $g'_{\mathbf{n}}(\mathbf{x})$ at a given site $\mathbf{x}$. They are easily extended, via the traditional
Cramér-Wold device, into a joint asymptotic normality result for any couple
$(\mathbf{x}_1, \mathbf{x}_2)$ (or any finite collection) of sites; the asymptotic covariance terms
[between $g_{\mathbf{n}}(\mathbf{x}_1)$ and $g'_{\mathbf{n}}(\mathbf{x}_2)$, $g_{\mathbf{n}}(\mathbf{x}_1)$ and $g_{\mathbf{n}}(\mathbf{x}_2)$, etc.] all are equal to zero,
as in related results on density estimation [see Hallin and Tran (1996) or Lu
(2001)]. The same remark also holds for Theorems 3.2-3.5 below.
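
For a univariate regressor ($d = 1$) and a symmetric density kernel, the bias and variance of the estimator of $g$ depend on the kernel only through the two constants $\int u^{2} K(u)\,du$ and $\int K^{2}(u)\,du$. The following sketch, offered as an illustration under these assumptions, evaluates the constants by a trapezoidal rule for the Epanechnikov kernel and plugs them into the standard large-sample approximations (bias of order $b^{2}$, variance of order $1/(\hat{\mathbf{n}} b)$). All function names and the numerical inputs (second derivative, conditional variance, density value, sample size, bandwidth) are hypothetical.

```python
import numpy as np


def trapezoid(values, grid):
    """Composite trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))


def kernel_constants(K, lo, hi, num=20001):
    """Numerically integrate K, u^2 K and K^2 over [lo, hi]."""
    u = np.linspace(lo, hi, num)
    k = K(u)
    return {
        "mass": trapezoid(k, u),          # should be 1 for a density kernel
        "mu2": trapezoid(u ** 2 * k, u),  # second moment of K
        "R": trapezoid(k ** 2, u),        # integral of K^2
    }


def epanechnikov(u):
    """Epanechnikov kernel 0.75 (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * np.clip(1.0 - u ** 2, 0.0, None)


def approx_bias(g_second, mu2, b):
    """Leading bias term (1/2) g''(x) mu2 b^2 for d = 1."""
    return 0.5 * g_second * mu2 * b ** 2


def approx_variance(cond_var, density, R, n_hat, b):
    """Leading variance term Var(Y | X = x) R / (f(x) n_hat b) for d = 1."""
    return cond_var * R / (density * n_hat * b)


consts = kernel_constants(epanechnikov, -1.0, 1.0)
print(consts)
print(approx_bias(g_second=2.0, mu2=consts["mu2"], b=0.5))
print(approx_variance(cond_var=1.0, density=0.5, R=consts["R"], n_hat=1200, b=0.5))
```

For the Epanechnikov kernel the exact values are $\int u^{2} K = 1/5$ and $\int K^{2} = 3/5$, which provides a check on the numerical integration.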

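Proof of Theorem 3.1. Since $q$ is $o((\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2N})$, there exists $s_{\mathbf{n}} \to 0$
such that $q = (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2N} s_{\mathbf{n}}$. Take $p_k := (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2N} s_{\mathbf{n}}^{1/2}$, $k = 1, \ldots, N$. Then
$q/p_k = s_{\mathbf{n}}^{1/2} \to 0$, $p = (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2} s_{\mathbf{n}}^{N/2} = o((\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2})$ and $\hat{\mathbf{n}}\varphi(q) = \hat{\mathbf{n}} q^{-\mu} \to 0$. As
$\mathbf{n} \Rightarrow \infty$, $p := \hat{\mathbf{p}} < (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2}$ for large $\hat{\mathbf{n}}$. It follows that $\hat{\mathbf{n}}/p > (\hat{\mathbf{n}} b_{\mathbf{n}}^{-d})^{1/2} \to \infty$,
hence $n_k/p_k \to \infty$ for all $k$. Thus, condition (B2) is satisfied.
Because $\varphi(j) = C j^{-\mu}$,

$m^{a} \sum_{j=m}^{\infty} j^{N-1}\{\varphi(j)\}^{\delta/(2+\delta)} \le C m^{a} \sum_{j=m}^{\infty} j^{N-1} j^{-\mu\delta/(2+\delta)} \le C m^{a} m^{N - \mu\delta/(2+\delta)} = C m^{-[\mu\delta/(2+\delta) - a - N]}$,

a quantity that tends to zero as $m \to \infty$ since $(4 + \delta)N/(2 + \delta) < a <
\mu\delta/(2 + \delta) - N$, hence $\mu\delta/(2 + \delta) > a + N$. Assumption (A4) and the fact
that $q b_{\mathbf{n}}^{\delta d/[a(2+\delta)]} > 1$ imply that $b_{\mathbf{n}}^{-\delta d/(2+\delta)} < q^{a}$, so that (2.3) holds. Now

$\mathbf{H}_{\mathbf{n}} - \mathbf{U}^{-1}E\mathbf{W}_{\mathbf{n}} = \mathbf{U}_{\mathbf{n}}^{-1}(\mathbf{W}_{\mathbf{n}} - E\mathbf{W}_{\mathbf{n}}) + (\mathbf{U}_{\mathbf{n}}^{-1} - \mathbf{U}^{-1})E\mathbf{W}_{\mathbf{n}}$.

The theorem thus follows from Lemmas 2.1, 2.3 and 3.1. □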

One of the important advantages of local polynomial (and linear) fitting
over the more traditional Nadaraya-Watson approach is that it has much
better boundary behavior. This advantage often has been emphasized in the
usual regression and time-series settings when the regressors take values on
a compact subset of $\mathbb{R}^d$. For example, as Fan and Gijbels (1996) and Fan
and Yao (2003) illustrate, for a univariate regressor $X$ with bounded support
($[0, 1]$, say; here, $d = 1$), it can be proved, using an argument similar to the
one we develop in the proof of Theorem 3.1, that asymptotic normality still
holds at the boundary point $x = c b_{\mathbf{n}}$ (here $c$ is a positive constant), but with
asymptotic bias and variances



(3.3) 



and 



Var(yj|Xj = 0+)nK2(n)dn 



(3.4) al-.- 



Var(Yj|Xj = 0+) 
/(0+) 



u K(u) du 



respectively. This advantage is likely to be much more substantial as $N$
grows. Therefore, results along the lines of (3.3) and (3.4) on the boundary
behavior of our estimators would be highly desirable. Such results, however,
are anything but straightforward, and we leave them for future research. On the
other hand, the statistical relevance of boundary effects is also of lesser
importance, as the ultimate objective in random fields, as opposed to time 
series, seldom consists in "forecasting" the process beyond the boundary of 
the observed domain. 

In the important particular case under which $\varphi(x)$ tends to zero at an
exponential rate, the same results are obtained under milder conditions.

Theorem 3.2. Let assumptions (A1)-(A3), (A4') and (A5) hold, with
$\varphi(x) = O(e^{-\xi x})$ for some $\xi > 0$. Then, if $b_{\mathbf{n}}$ tends to zero as $\mathbf{n} \Rightarrow \infty$ in such
a manner that

(3.5) $(\hat{\mathbf{n}} b_{\mathbf{n}}^{d(1 + 2N\delta/[a(2+\delta)])})^{1/2N} (\log \hat{\mathbf{n}})^{-1} \to \infty$

for some $a > (4 + \delta)N/(2 + \delta)$, the conclusions of Theorem 3.1 still hold.



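Proof. By (3.5), there exists a monotone positive function $\mathbf{n} \mapsto g(\mathbf{n})$
such that $g(\mathbf{n}) \to \infty$ and $(\hat{\mathbf{n}} b_{\mathbf{n}}^{d(1 + 2N\delta/[a(2+\delta)])})^{1/2N} (g(\mathbf{n}) \log \hat{\mathbf{n}})^{-1} \to \infty$ as $\mathbf{n} \Rightarrow \infty$.
Let $q := (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2N} (g(\mathbf{n}))^{-1}$, and $p_k := (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2N} g^{-1/2}(\mathbf{n})$. Then $q/p_k = g^{-1/2}(\mathbf{n}) \to 0$,
$p = (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2} g^{-N/2}(\mathbf{n}) = o((\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2})$ and $n_k/p_k \to \infty$ as $\mathbf{n} \Rightarrow \infty$. For
arbitrary $C > 0$, $q \ge C \log \hat{\mathbf{n}}$ for sufficiently large $\hat{\mathbf{n}}$. Thus

$\hat{\mathbf{n}}\varphi(q) \le C \hat{\mathbf{n}} e^{-\xi q} \le C \hat{\mathbf{n}} \exp(-C\xi \log \hat{\mathbf{n}}) = C \hat{\mathbf{n}}^{-C\xi + 1}$,

which tends to zero if we choose $C > 1/\xi$. Hence condition (B2) is satisfied.
Next, for $0 < \xi' < \xi$,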

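$q^{a} \sum_{i=q}^{\infty} i^{N-1}\{\varphi(i)\}^{\delta/(2+\delta)} \le C q^{a} \sum_{i=q}^{\infty} i^{N-1} e^{-\xi i \delta/(2+\delta)} \le C q^{a} \sum_{i=q}^{\infty} e^{-\xi' i \delta/(2+\delta)} \le C q^{a} e^{-\xi' q \delta/(2+\delta)}$.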

Note that 6^ ^ Cn~^ and g > Clogfi, so that assumption (A4) holds. In 
addition, 

for n large enough. It is easily verified that this implies that condition (B3) 
is satisfied. The theorem follows. □ 

Note that, in the one-dimensional case $N = 1$, and for "large" values of $a$,
the condition (3.5) is "close" to the condition that $n b_n^{d} \to \infty$, which is usual
in the classical case of independent observations.

Next we consider the situation under which the sample size tends to $\infty$
in the "weak" sense (i.e., $\mathbf{n} \to \infty$ instead of $\mathbf{n} \Rightarrow \infty$).

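Theorem 3.3. Let assumptions (A1)-(A3), (A4') and (A5) hold, with
$\varphi(x) = O(x^{-\mu})$ for some $\mu > 2(3 + \delta)N/\delta$. Let the sequence of positive
integers $q = q_{\mathbf{n}} \to \infty$ and the bandwidth $b_{\mathbf{n}}$ factor into $b_{\mathbf{n}} := \prod_{i=1}^{N} b_{n_i}$, such that
$\hat{\mathbf{n}} q^{-\mu} \to 0$, $q = o(\min_{1 \le k \le N}(n_k b_{n_k}^{d})^{1/2})$, and

$q b_{\mathbf{n}}^{\delta d/[a(2+\delta)]} > 1$ for some $(4 + \delta)N/(2 + \delta) < a < \mu\delta/(2 + \delta) - N$.

Then the conclusions of Theorem 3.1 hold as $\mathbf{n} \to \infty$.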

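Proof. Since $q = o(\min_{1 \le k \le N}(n_k b_{n_k}^{d})^{1/2})$, there exists a sequence $s_{n_k} \to 0$
such that

$q = \min_{1 \le k \le N}((n_k b_{n_k}^{d})^{1/2} s_{n_k})$ as $\mathbf{n} \to \infty$.

Take $p_k = (n_k b_{n_k}^{d})^{1/2} s_{n_k}^{1/2}$. Then $q/p_k \le s_{n_k}^{1/2} \to 0$, $p = (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2} \prod_{k=1}^{N} s_{n_k}^{1/2} =
o((\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2})$ and $\hat{\mathbf{n}}\varphi(q) = \hat{\mathbf{n}} q^{-\mu} \to 0$. As $\mathbf{n} \to \infty$, $p_k < (n_k b_{n_k}^{d})^{1/2}$, hence $n_k/p_k >
(n_k b_{n_k}^{-d})^{1/2} \to \infty$. Thus condition (B2) is satisfied. The end of the proof is
entirely similar to that of Theorem 3.1. □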

In the important case that $\varphi(x)$ tends to zero at an exponential rate, we
have the following result, which parallels Theorem 3.2.

Theorem 3.4. Let assumptions (A1)-(A3), (A4') and (A5) hold, with
$\varphi(x) = O(e^{-\xi x})$ for some $\xi > 0$. Let the bandwidth $b_{\mathbf{n}}$ factor into $b_{\mathbf{n}} :=
\prod_{i=1}^{N} b_{n_i}$ in such a way that, as $\mathbf{n} \to \infty$,

(3.6) $\min_{1 \le k \le N}\{(n_k b_{n_k}^{d})^{1/2}\}\, b_{\mathbf{n}}^{\delta d/[a(2+\delta)]} (\log \hat{\mathbf{n}})^{-1} \to \infty$

for some $a > (4 + \delta)N/(2 + \delta)$. Then the conclusions of Theorem 3.1 hold as
$\mathbf{n} \to \infty$.


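Proof. By (3.6) there exist positive sequences indexed by $n_k$ such that
$g_{n_k} \uparrow \infty$ as $n_k \to \infty$ and

$\min_{1 \le k \le N}\{(n_k b_{n_k}^{d})^{1/2} g_{n_k}^{-1}\}\, b_{\mathbf{n}}^{\delta d/[a(2+\delta)]} (\log \hat{\mathbf{n}})^{-1} \to \infty$

as $\mathbf{n} \to \infty$. Let $q := \min_{1 \le k \le N}\{(n_k b_{n_k}^{d})^{1/2} g_{n_k}^{-1}\}$ and $p_k := (n_k b_{n_k}^{d})^{1/2} g_{n_k}^{-1/2}$.
Then $q/p_k \le g_{n_k}^{-1/2} \to 0$, $p = (\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2} \prod_{k=1}^{N} g_{n_k}^{-1/2} = o((\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2})$ and $n_k/p_k =
(n_k b_{n_k}^{-d})^{1/2} g_{n_k}^{1/2} \to \infty$ as $\mathbf{n} \to \infty$. For arbitrary $C > 0$, $q \ge C \log \hat{\mathbf{n}}$ for
sufficiently large $\hat{\mathbf{n}}$. Thus

$\hat{\mathbf{n}}\varphi(q) \le C \hat{\mathbf{n}} e^{-\xi q} \le C \hat{\mathbf{n}} \exp(-C\xi \log \hat{\mathbf{n}}) = C \hat{\mathbf{n}}^{-C\xi + 1}$,

which tends to zero for $C > 1/\xi$. Hence, condition (B2) is satisfied. Next,
for $0 < \xi' < \xi$,

$q^{a} \sum_{i=q}^{\infty} i^{N-1}\{\varphi(i)\}^{\delta/(2+\delta)} \le C q^{a} \sum_{i=q}^{\infty} i^{N-1} e^{-\xi i \delta/(2+\delta)} \le C q^{a} \sum_{i=q}^{\infty} e^{-\xi' i \delta/(2+\delta)} \le C q^{a} e^{-\xi' q \delta/(2+\delta)}$.

Note that $q \ge C \log \hat{\mathbf{n}}$. Assumption (A4') and (3.1) imply that $q b_{\mathbf{n}}^{\delta d/[a(2+\delta)]} > 1$
for $\hat{\mathbf{n}}$ large enough. This in turn implies that condition (B3) is satisfied. The
theorem follows. □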



3.2. Asymptotic normality under mixing assumption (A4"). We start
with an equivalent, under (A4"), of Lemma 3.1.

Lemma 3.2. Suppose that assumptions (A1), (A2), (A4) or (A4"), and
(A5) hold, and that the bandwidth $b_{\mathbf{n}}$ satisfies conditions (B1), (B2') and
(B3). Then the conclusions of Lemma 3.1 still hold as $\mathbf{n} \to \infty$.

Proof. The proof is a slight variation of the argument of Lemma 3.1,
and we describe it only briefly. The only significant difference is in the
checking of (5.18). Let $U_1, \ldots, U_M$ be as in Lemma 3.1. By Lemma 5.3 and
assumption (A4"),

$Q_1 \le C \sum_{i=1}^{M} [p + (M - i)p + 1]^{\kappa} \varphi(q)$,

which tends to zero by condition (B2'); (5.18) follows. □

We then have the following counterpart of Theorem 3.1. 

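Theorem 3.5. Let assumptions (A1)-(A3), (A4") and (A5) hold, with
$\varphi(x) = O(x^{-\mu})$ for some $\mu > 2(3 + \delta)N/\delta$. Suppose that there exists a
sequence of positive integers $q = q_{\mathbf{n}} \to \infty$ such that $q_{\mathbf{n}} = o((\hat{\mathbf{n}} b_{\mathbf{n}}^{d})^{1/2N})$ and
$\hat{\mathbf{n}}^{\kappa+1} q^{-\mu-N} \to 0$ as $\mathbf{n} \Rightarrow \infty$, and that the bandwidth $b_{\mathbf{n}}$ tends to zero in such
a manner that (3.1) is satisfied as $\mathbf{n} \Rightarrow \infty$. Then the conclusions of
Theorem 3.1 hold.

Proof. Choose the same values for $p_1, \ldots, p_N$ and $q$ as in the proof of
Theorem 3.1. Note that, because $p > q^{N}$ and $\hat{\mathbf{n}}^{\kappa+1} q^{-\mu-N} = o(1)$,

$(\hat{\mathbf{n}}^{\kappa+1}/p)\varphi(q) \le C \hat{\mathbf{n}}^{\kappa+1} q^{-N} q^{-\mu} = C \hat{\mathbf{n}}^{\kappa+1} q^{-\mu-N} \to 0$

as $\mathbf{n} \Rightarrow \infty$. The end of the proof is entirely similar to that of Theorem 3.1,
with Lemma 3.2 instead of Lemma 3.1. □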

Analogues of Theorems 3.2-3.4 can also be obtained under assumption (A4"); 
details are omitted for the sake of brevity. 

4. Numerical results. In this section, we report the results of a brief
Monte Carlo study of the method described in this paper. We mainly
consider two models, both in a two-dimensional space ($N = 2$) [writing $(i, j)$
instead of $(i_1, i_2)$ for the sites $\mathbf{i} \in \mathbb{Z}^2$]. For the sake of simplicity, $\mathbf{X}$ (written
as $X$) is univariate ($d = 1$).

(a) Model 1. Denoting by $\{u_{ij}, (i, j) \in \mathbb{Z}^2\}$ and $\{\varepsilon_{ij}, (i, j) \in \mathbb{Z}^2\}$ two
mutually independent i.i.d. $\mathcal{N}(0, 1)$ white-noise processes, let

Yij = giXij) + Uij with g{x) := ^e"^ + fe"^, 

where $\{X_{ij}, (i, j) \in \mathbb{Z}^2\}$ is generated by the spatial autoregression

$X_{ij} = \sin(X_{i-1,j} + X_{i,j-1} + X_{i+1,j} + X_{i,j+1}) + \varepsilon_{ij}$.

(b) Model 2. Denoting again by $\{\varepsilon_{ij}, (i, j) \in \mathbb{Z}^2\}$ an i.i.d. $\mathcal{N}(0, 1)$
white-noise process, let $\{Y_{ij}, (i, j) \in \mathbb{Z}^2\}$ be generated by

(4.1) $Y_{ij} = \sin(Y_{i-1,j} + Y_{i,j-1} + Y_{i+1,j} + Y_{i,j+1}) + \varepsilon_{ij}$,

and set $X_{ij} := Y_{i-1,j} + Y_{i,j-1} + Y_{i+1,j} + Y_{i,j+1}$. The prediction function
$x \mapsto g(x) := E[Y_{ij} \mid X_{ij} = x]$ provides the optimal prediction of $Y_{ij}$ based
on $X_{ij}$ in the sense of minimal mean squared
prediction error. Note that, in the spatial context, this optimal prediction
function $g(\cdot)$ generally differs from the spatial autoregression function itself
[here, $\sin(\cdot)$]; see Whittle (1954) for details. Beyond a simple estimation of $g$,
we also will investigate the impact, on prediction performance, of including
additional spatial lags of $Y_{ij}$ into the definition of $X_{ij}$.

Data were simulated from these two models over a rectangular domain
of $m \times n$ sites; more precisely, over a grid of the form $\{(i, j) \mid 76 \le i \le
75 + m,\ 76 \le j \le 75 + n\}$, for various values of $m$ and $n$. Each replication was
obtained iteratively along the following steps. First, we simulated i.i.d.
random variables $\varepsilon_{ij}$ over the grid $\{(i, j),\ i = 1, \ldots, 150 + m,\ j = 1, \ldots, 150 + n\}$.
Next, all initial values of $Y_{ij}$ and $X_{ij}$ being set to zero, we generated $Y_{ij}$'s
(or $X_{ij}$'s) over $\{(i, j),\ i = 1, \ldots, 150 + m,\ j = 1, \ldots, 150 + n\}$ recursively, using
the spatial autoregressive models. Starting from these generated values, the
process was iterated 20 times. The results at the final iteration step for $(i, j)$
inside $\{(i, j) \mid 76 \le i \le 75 + m,\ 76 \le j \le 75 + n\}$ were taken as our simulated
$m \times n$ sample. This discarding of peripheral sites allows for a warming-up
zone, and the first 19 iterations were taken as warming-up steps aiming at
achieving stationarity. From the resulting $m \times n$ central data set, we
estimated the spatial regression/prediction function using the local linear
approach described in this paper. A data-driven choice of the bandwidth in this
context would be highly desirable. In view of the lack of theoretical results
on this point, we uniformly chose a bandwidth of 0.5 in all our simulations.
The simulation results, each with 10 replications, are displayed in Figures 1
and 2 for Models 1 and 2, respectively. Model 1 is a spatial regression model,
with the covariates $X_{ij}$ forming a nonlinear autoregressive process.
Inspection of Figure 1 shows that the estimation of the regression function $g(\cdot)$ is
quite good and stable, even for sample sizes as small as $m = 10$ and $n = 20$.

Fig. 1. Simulation for Model 1. The local linear estimates corresponding to
the 10 replications (solid lines) and actual spatial regression curve (dotted line)
$g(x) = E(Y_{ij} \mid X_{ij} = x)$ of Model 1, for sample size $m = 10$, $n = 20$, with autoregressive
spatial covariate $X_{ij}$. The scatterplot shows the observations $(X_{ij}, Y_{ij})$ corresponding to
one typical realization among 10.

Model 2 is a spatial autoregressive model, where $Y_{ij}$ forms a process with
nonlinear spatial autoregression function $\sin(\cdot)$. Various definitions of $X_{ij}$,
involving different spatial lags of $Y_{ij}$, yield various prediction functions,
which are shown in Figures 2(a)-(f). The results in Figures 2(a) and (b)
correspond to $X_{ij} := Y_{i-1,j} + Y_{i,j-1} + Y_{i+1,j} + Y_{i,j+1}$, that is, the lags
of order $\pm 1$ of $Y_{ij}$ which also appear in the generating process (4.1). In
Figure 2(a), the sample sizes $m = 10$ and $n = 20$ are the same as in Figure 1,
but the results (still, for 10 replications) are more dispersed. In Figure 2(b),
the sample sizes ($m = 30$ and $n = 40$) are slightly larger, and the results
(over 10 replications) seem much more stable. These sample sizes therefore
were maintained throughout all subsequent simulations. In Figure 2(c), we
chose

$X_{ij} := Y_{i-2,j} + Y_{i,j-2} + Y_{i-1,j} + Y_{i,j-1} + Y_{i+1,j} + Y_{i,j+1} + Y_{i+2,j} + Y_{i,j+2}$,

thus including lagged values of $Y_{ij}$ up to order $\pm 2$, in an isotropic way.
Nonisotropic choices of $X_{ij}$ were made in the simulations reported in
Figures 2(d)-(f): $X_{ij} := Y_{i-1,j} + Y_{i,j-1}$ in Figure 2(d), $X_{ij} := Y_{i+1,j} + Y_{i,j+1}$ in
Figure 2(e) and $X_{ij} := Y_{i-2,j} + Y_{i,j-2} + Y_{i-1,j} + Y_{i,j-1}$ in Figure 2(f).




Fig. 2. Simulation for Model 2. The local linear estimates corresponding to the 10 replications (solid lines) of the spatial prediction function $g(x) = \mathrm{E}(Y_{ij} \mid X_{ij} = x)$, with sample sizes $m = 10$, $n = 20$ in (a) and $m = 30$, $n = 40$ in (b)-(f), for different spatial covariates $X_{ij}$'s:
(a) $X^{a}_{ij} := Y_{i-1,j} + Y_{i,j-1} + Y_{i+1,j} + Y_{i,j+1}$;
(b) $X^{b}_{ij} := Y_{i-1,j} + Y_{i,j-1} + Y_{i+1,j} + Y_{i,j+1}$;
(c) $X^{c}_{ij} := Y_{i-2,j} + Y_{i,j-2} + Y_{i-1,j} + Y_{i,j-1} + Y_{i+1,j} + Y_{i,j+1} + Y_{i+2,j} + Y_{i,j+2}$;
(d) $X^{d}_{ij} := Y_{i-1,j} + Y_{i,j-1}$;
(e) $X^{e}_{ij} := Y_{i+1,j} + Y_{i,j+1}$; and
(f) $X^{f}_{ij} := Y_{i-2,j} + Y_{i,j-2} + Y_{i-1,j} + Y_{i,j-1}$.
The scatterplot shows the observations $(X_{ij}, Y_{ij})$ corresponding to one typical realization among 10.

A more systematic simulation study certainly would be welcome. How- 
ever, it seems that, even in very small samples (see Figure 1), the perfor- 
mance of our method is excellent in pure spatial regression problems (with 
spatially correlated covariates), while larger samples are required in spatial 
autoregression models. This difference is probably strongly related to differences in the corresponding noise-to-signal ratios. Letting $g(x) = \mathrm{E}(Y \mid X = x)$ and $e = Y - g(X)$, the noise-to-signal ratio is defined as $\mathrm{Var}(e)/\mathrm{Var}(g(X))$; see, for example, Chapter 4 in Fan and Gijbels (1996) for details. In a classical regression setting, independence is generally assumed between $X$ and $e$, so that this ratio, in simulations, can be set in advance. Such an independence assumption cannot be made in a spatial series context, but empirical versions of the ratio nevertheless can be computed from each replication, then averaged, providing estimated values. In Model 1 this estimated value
(averaged over the 10 replications) of the noise-to-signal ratio is 0.214. The 
values for the six versions of Model 2 (still, averaged over 10 replications) 
are much larger: (a) 12.037, (b) 13.596, (c) 43.946, (d) 47.442, (e) 116.334 
and (f) 88.287. 
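The empirical noise-to-signal ratio described above can be sketched as follows. This is a minimal illustration, not the code behind the reported values: the function names are hypothetical, and the values of $g(X_{ij})$ are assumed to be supplied by the caller (whether the true or an estimated regression function was used in the simulations is not stated here).

```python
import numpy as np

def noise_to_signal(y, g_of_x):
    """Empirical Var(e) / Var(g(X)) with e = Y - g(X), for one replication."""
    y = np.asarray(y, dtype=float).ravel()
    g_of_x = np.asarray(g_of_x, dtype=float).ravel()
    e = y - g_of_x
    return float(np.var(e) / np.var(g_of_x))

def averaged_ratio(replications):
    """Average the per-replication ratios.

    `replications` is an iterable of (y, g_of_x) pairs, one per replication.
    """
    return float(np.mean([noise_to_signal(y, g) for y, g in replications]))
```

Averaging the per-replication ratios, rather than pooling all observations, follows the description in the text ("computed from each replication, then averaged").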

5. Proofs. 

5.1. Proof of Lemma 2.2. The proof of Lemma 2.2 relies on two inter- 
mediate results. The first one is a lemma borrowed from Ibragimov and 
Linnik (1971) or Deo (1973), to which we refer for a proof. 

Lemma 5.1. (i) Suppose that (A1) holds. Let $\mathcal{L}_r(\mathcal{F})$ denote the class of $\mathcal{F}$-measurable random variables $\xi$ satisfying $\|\xi\|_r := (\mathrm{E}|\xi|^r)^{1/r} < \infty$. Let $X \in \mathcal{L}_r(\mathcal{B}(S))$ and $Y \in \mathcal{L}_s(\mathcal{B}(S'))$. Then for any $1 \le r, s, h < \infty$ such that $r^{-1} + s^{-1} + h^{-1} = 1$,

(5.1) $$|\mathrm{E}[XY] - \mathrm{E}[X]\mathrm{E}[Y]| \le C \|X\|_r \|Y\|_s [\alpha(S, S')]^{1/h},$$

where $\|X\|_r := \|(X^{\tau}X)^{1/2}\|_r$.

(ii) If, moreover, $\|X\| := (X^{\tau}X)^{1/2}$ and $|Y|$ are P-a.s. bounded, the right-hand side of (5.1) can be replaced by $C\alpha(S, S')$.

The second one is a lemma of independent interest, which plays a crucial 
role here and in the subsequent sections. For the sake of generality, and in 
order for this lemma to apply beyond the specific context of this paper, 
we do not necessarily assume that the mixing coefficient a takes the form 
imposed in assumption (A4). 

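Before stating the lemma, let us first introduce some further notation. Let

$$A_n := (\hat{\mathbf{n}} b_n^d)^{-1/2} \sum_{\mathbf{j} \in \mathcal{I}_{\mathbf{n}}} \eta_{\mathbf{j}}(\mathbf{x})$$

and

$$\mathrm{Var}(A_n) = (\hat{\mathbf{n}} b_n^d)^{-1} \sum_{\mathbf{j} \in \mathcal{I}_{\mathbf{n}}} \mathrm{E}[\Delta_{\mathbf{j}}^2(\mathbf{x})] + (\hat{\mathbf{n}} b_n^d)^{-1} \sum_{\mathbf{i} \ne \mathbf{j} \in \mathcal{I}_{\mathbf{n}}} \mathrm{E}[\Delta_{\mathbf{i}}(\mathbf{x}) \Delta_{\mathbf{j}}(\mathbf{x})] := \tilde{I}(\mathbf{x}) + \tilde{R}(\mathbf{x}), \quad \text{say},$$

where $\eta_{\mathbf{j}}(\mathbf{x}) := Z_{\mathbf{j}} K_c(\mathbf{x} - \mathbf{X}_{\mathbf{j}})$ and $\Delta_{\mathbf{j}}(\mathbf{x}) := \eta_{\mathbf{j}}(\mathbf{x}) - \mathrm{E}\eta_{\mathbf{j}}(\mathbf{x})$. For any $\mathbf{c}_n := (c_{n1}, \dots, c_{nN}) \in \mathbb{Z}^N$ with $1 \le c_{nk} \le n_k$ for all $k = 1, \dots, N$, define

$$J_1(\mathbf{x}) := b_n^{\delta d/(4+\delta) + d}\, \hat{\mathbf{n}} \prod_{k=1}^{N} c_{nk}$$

and

$$J_2(\mathbf{x}) := b_n^{2d/(2+\delta)}\, \hat{\mathbf{n}} \sum_{k=1}^{N} \Biggl( \sum_{\substack{|j_s| = 1 \\ s = 1, \dots, k-1}}^{n_s} \; \sum_{|j_k| = c_{nk}}^{n_k} \; \sum_{\substack{|j_s| = 1 \\ s = k+1, \dots, N}}^{n_s} \{\varphi(j_1, \dots, j_N)\}^{\delta/(2+\delta)} \Biggr).$$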



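Lemma 5.2. Let $\{(Y_{\mathbf{j}}, \mathbf{X}_{\mathbf{j}}); \mathbf{j} \in \mathbb{Z}^N\}$ denote a stationary spatial process with general mixing coefficient

$$\varphi(\mathbf{j}) = \varphi(j_1, \dots, j_N) := \sup\{|\mathrm{P}(AB) - \mathrm{P}(A)\mathrm{P}(B)| : A \in \mathcal{B}(\{Y_{\mathbf{i}}, \mathbf{X}_{\mathbf{i}}\}),\ B \in \mathcal{B}(\{Y_{\mathbf{i}+\mathbf{j}}, \mathbf{X}_{\mathbf{i}+\mathbf{j}}\})\},$$

and assume that assumptions (A1), (A2) and (A5) hold. Then

(5.2) $$|\tilde{R}(\mathbf{x})| \le C (\hat{\mathbf{n}} b_n^d)^{-1} [J_1(\mathbf{x}) + J_2(\mathbf{x})].$$

If furthermore $\varphi(j_1, \dots, j_N)$ takes the form $\varphi(\|\mathbf{j}\|)$, then

(5.3) $$J_2(\mathbf{x}) \le C b_n^{2d/(2+\delta)}\, \hat{\mathbf{n}} \sum_{k=1}^{N} \Biggl( \sum_{t = c_{nk}}^{\|\mathbf{n}\|} t^{N-1} \{\varphi(t)\}^{\delta/(2+\delta)} \Biggr).$$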

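Proof. Set $L = L_n = b_n^{-2d/(4+\delta)}$. Defining $Z_{1\mathbf{j}} := Z_{\mathbf{j}} I_{\{|Z_{\mathbf{j}}| \le L\}}$ and $Z_{2\mathbf{j}} := Z_{\mathbf{j}} I_{\{|Z_{\mathbf{j}}| > L\}}$, let

$$\eta_{i\mathbf{j}}(\mathbf{x}) := Z_{i\mathbf{j}} K_c(\mathbf{x} - \mathbf{X}_{\mathbf{j}}) \quad \text{and} \quad \Delta_{i\mathbf{j}}(\mathbf{x}) := \eta_{i\mathbf{j}}(\mathbf{x}) - \mathrm{E}\eta_{i\mathbf{j}}(\mathbf{x}), \qquad i = 1, 2.$$

Then $Z_{\mathbf{j}} = Z_{1\mathbf{j}} + Z_{2\mathbf{j}}$, $\Delta_{\mathbf{j}}(\mathbf{x}) = \Delta_{1\mathbf{j}}(\mathbf{x}) + \Delta_{2\mathbf{j}}(\mathbf{x})$, and hence

(5.4) $$\mathrm{E}\Delta_{\mathbf{j}}(\mathbf{x})\Delta_{\mathbf{i}}(\mathbf{x}) = \mathrm{E}\Delta_{1\mathbf{j}}(\mathbf{x})\Delta_{1\mathbf{i}}(\mathbf{x}) + \mathrm{E}\Delta_{1\mathbf{j}}(\mathbf{x})\Delta_{2\mathbf{i}}(\mathbf{x}) + \mathrm{E}\Delta_{2\mathbf{j}}(\mathbf{x})\Delta_{1\mathbf{i}}(\mathbf{x}) + \mathrm{E}\Delta_{2\mathbf{j}}(\mathbf{x})\Delta_{2\mathbf{i}}(\mathbf{x}).$$

First, we note that

$$\begin{aligned}
b_n^{-d} |\mathrm{E}\Delta_{1\mathbf{j}}(\mathbf{x})\Delta_{2\mathbf{i}}(\mathbf{x})|
&\le \{b_n^{-d}\mathrm{E}\eta_{1\mathbf{j}}^2(\mathbf{x})\}^{1/2} \{b_n^{-d}\mathrm{E}\eta_{2\mathbf{i}}^2(\mathbf{x})\}^{1/2} \\
&\le \{b_n^{-d}\mathrm{E} Z_{1\mathbf{j}}^2 K_c^2((\mathbf{x} - \mathbf{X}_{\mathbf{j}})/b_n)\}^{1/2} \{b_n^{-d}\mathrm{E} Z_{2\mathbf{i}}^2 K_c^2((\mathbf{x} - \mathbf{X}_{\mathbf{i}})/b_n)\}^{1/2} \\
&\le C \{b_n^{-d}\mathrm{E}|Z_{\mathbf{i}}|^2 I_{\{|Z_{\mathbf{i}}| > L\}} K_c((\mathbf{x} - \mathbf{X}_{\mathbf{i}})/b_n)\}^{1/2} \\
&\le C \{L^{-\delta} b_n^{-d}\mathrm{E}|Z_{\mathbf{i}}|^{2+\delta} I_{\{|Z_{\mathbf{i}}| > L\}} K_c((\mathbf{x} - \mathbf{X}_{\mathbf{i}})/b_n)\}^{1/2}.
\end{aligned}$$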
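Similarly,

$$b_n^{-d} |\mathrm{E}\Delta_{2\mathbf{j}}(\mathbf{x})\Delta_{1\mathbf{i}}(\mathbf{x})| \le C L_n^{-\delta/2} = C b_n^{\delta d/(4+\delta)} \quad \text{and} \quad b_n^{-d} |\mathrm{E}\Delta_{2\mathbf{j}}(\mathbf{x})\Delta_{2\mathbf{i}}(\mathbf{x})| \le C b_n^{2\delta d/(4+\delta)}.$$

Next, for $\mathbf{i} \ne \mathbf{j}$, letting $K_n(\mathbf{x}) := (1/b_n^d) K(\mathbf{x}/b_n)$ and $K_{cn}(\mathbf{x}) := (1/b_n^d) K_c(\mathbf{x}/b_n)$,

$$\begin{aligned}
b_n^{-d}\, \mathrm{E}\Delta_{1\mathbf{j}}(\mathbf{x})\Delta_{1\mathbf{i}}(\mathbf{x})
&= b_n^{d} \{\mathrm{E} Z_{1\mathbf{i}} Z_{1\mathbf{j}} K_{cn}(\mathbf{x} - \mathbf{X}_{\mathbf{i}}) K_{cn}(\mathbf{x} - \mathbf{X}_{\mathbf{j}}) - \mathrm{E} Z_{1\mathbf{i}} K_{cn}(\mathbf{x} - \mathbf{X}_{\mathbf{i}})\, \mathrm{E} Z_{1\mathbf{j}} K_{cn}(\mathbf{x} - \mathbf{X}_{\mathbf{j}})\} \\
&= b_n^{d} \iint K_{cn}(\mathbf{x} - \mathbf{u}) K_{cn}(\mathbf{x} - \mathbf{v}) \{g_{1\mathbf{i}\mathbf{j}}(\mathbf{u}, \mathbf{v}) f_{\mathbf{i},\mathbf{j}}(\mathbf{u}, \mathbf{v}) - g_1^{(1)}(\mathbf{u}) g_1^{(1)}(\mathbf{v}) f(\mathbf{u}) f(\mathbf{v})\}\, d\mathbf{u}\, d\mathbf{v},
\end{aligned}$$

where $g_{1\mathbf{i}\mathbf{j}}(\mathbf{u}, \mathbf{v}) := \mathrm{E}(Z_{1\mathbf{i}} Z_{1\mathbf{j}} \mid \mathbf{X}_{\mathbf{i}} = \mathbf{u}, \mathbf{X}_{\mathbf{j}} = \mathbf{v})$, and $g_1^{(1)}(\mathbf{u}) := \mathrm{E}(Z_{1\mathbf{i}} \mid \mathbf{X}_{\mathbf{i}} = \mathbf{u})$. Since, by definition, $|Z_{1\mathbf{i}}| \le L_n$, we have that $|g_{1\mathbf{i}\mathbf{j}}(\mathbf{u}, \mathbf{v})| \le L_n^2$ and $|g_1^{(1)}(\mathbf{u}) g_1^{(1)}(\mathbf{v})| \le L_n^2$. Thus

$$\begin{aligned}
&|g_{1\mathbf{i}\mathbf{j}}(\mathbf{u}, \mathbf{v}) f_{\mathbf{i},\mathbf{j}}(\mathbf{u}, \mathbf{v}) - g_1^{(1)}(\mathbf{u}) g_1^{(1)}(\mathbf{v}) f(\mathbf{u}) f(\mathbf{v})| \\
&\quad \le |g_{1\mathbf{i}\mathbf{j}}(\mathbf{u}, \mathbf{v}) (f_{\mathbf{i},\mathbf{j}}(\mathbf{u}, \mathbf{v}) - f(\mathbf{u}) f(\mathbf{v}))| + |(g_{1\mathbf{i}\mathbf{j}}(\mathbf{u}, \mathbf{v}) - g_1^{(1)}(\mathbf{u}) g_1^{(1)}(\mathbf{v})) f(\mathbf{u}) f(\mathbf{v})| \\
&\quad \le L_n^2 |f_{\mathbf{i},\mathbf{j}}(\mathbf{u}, \mathbf{v}) - f(\mathbf{u}) f(\mathbf{v})| + 2 L_n^2 f(\mathbf{u}) f(\mathbf{v}).
\end{aligned}$$

It then follows from (A1) and the Lebesgue density theorem [see Chapter 2 of Devroye and Gyorfi (1985)] that

(5.5) $$\begin{aligned}
b_n^{-d} |\mathrm{E}\Delta_{1\mathbf{j}}(\mathbf{x})\Delta_{1\mathbf{i}}(\mathbf{x})|
&\le b_n^{d} \iint K_{cn}(\mathbf{x} - \mathbf{u}) K_{cn}(\mathbf{x} - \mathbf{v}) L_n^2 |f_{\mathbf{i},\mathbf{j}}(\mathbf{u}, \mathbf{v}) - f(\mathbf{u}) f(\mathbf{v})|\, d\mathbf{u}\, d\mathbf{v} \\
&\quad + b_n^{d} \iint K_{cn}(\mathbf{x} - \mathbf{u}) K_{cn}(\mathbf{x} - \mathbf{v})\, 2 L_n^2 f(\mathbf{u}) f(\mathbf{v})\, d\mathbf{u}\, d\mathbf{v} \\
&\le C b_n^{d} \Biggl( L_n^2 \Bigl\{ \int K_{cn}(\mathbf{x} - \mathbf{u})\, d\mathbf{u} \Bigr\}^2 + L_n^2 \Bigl\{ \int K_{cn}(\mathbf{x} - \mathbf{u}) f(\mathbf{u})\, d\mathbf{u} \Bigr\}^2 \Biggr).
\end{aligned}$$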

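Thus, by (5.4) and (5.5),

(5.6) $$b_n^{-d} |\mathrm{E}\Delta_{\mathbf{j}}(\mathbf{x})\Delta_{\mathbf{i}}(\mathbf{x})| \le C L_n^{-\delta/2} + C b_n^{d} L_n^{2} = C b_n^{\delta d/(4+\delta)}.$$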
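The final equality in (5.6) reflects the choice of truncation level: with $L_n = b_n^{-2d/(4+\delta)}$, the two terms on the right-hand side are of exactly the same order. A short verification of the exponents, using only the quantities already defined in the proof:

```latex
\begin{aligned}
L_n^{-\delta/2} &= \bigl(b_n^{-2d/(4+\delta)}\bigr)^{-\delta/2} = b_n^{\delta d/(4+\delta)}, \\
b_n^{d} L_n^{2} &= b_n^{d}\, b_n^{-4d/(4+\delta)} = b_n^{d(4+\delta-4)/(4+\delta)} = b_n^{\delta d/(4+\delta)}.
\end{aligned}
```

A larger truncation level would shrink the first term but inflate the second, and conversely, so this choice balances the two.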

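Let $\mathbf{c}_n = (c_{n1}, \dots, c_{nN}) \in \mathbb{R}^N$ be a sequence of vectors with positive components. Define

$$\mathcal{S}_1 := \{\mathbf{i} \ne \mathbf{j} \in \mathcal{I}_{\mathbf{n}} : |j_k - i_k| \le c_{nk} \text{ for all } k = 1, \dots, N\}$$

and

$$\mathcal{S}_2 := \{\mathbf{i}, \mathbf{j} \in \mathcal{I}_{\mathbf{n}} : |j_k - i_k| > c_{nk} \text{ for some } k = 1, \dots, N\}.$$

Clearly, $\mathrm{Card}(\mathcal{S}_1) \le 2^N \hat{\mathbf{n}} \prod_{k=1}^{N} c_{nk}$. Splitting $\tilde{R}(\mathbf{x})$ into $(\hat{\mathbf{n}} b_n^d)^{-1}(\tilde{J}_1 + \tilde{J}_2)$, with

$$\tilde{J}_\ell := \sum\sum_{\mathbf{i}, \mathbf{j} \in \mathcal{S}_\ell} \mathrm{E}\Delta_{\mathbf{j}}(\mathbf{x})\Delta_{\mathbf{i}}(\mathbf{x}), \qquad \ell = 1, 2,$$

it follows from (5.6) that

(5.7) $$|\tilde{J}_1| \le C b_n^{\delta d/(4+\delta) + d}\, \mathrm{Card}(\mathcal{S}_1) \le 2^N C b_n^{\delta d/(4+\delta) + d}\, \hat{\mathbf{n}} \prod_{k=1}^{N} c_{nk}.$$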

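Turning to $\tilde{J}_2$, we have $|\tilde{J}_2| \le \sum\sum_{\mathbf{i}, \mathbf{j} \in \mathcal{S}_2} |\mathrm{E}\Delta_{\mathbf{j}}(\mathbf{x})\Delta_{\mathbf{i}}(\mathbf{x})|$. Lemma 5.1, with $r = s = 2 + \delta$ and $h = (2 + \delta)/\delta$, yields

(5.8) $$\begin{aligned}
|\mathrm{E}\Delta_{\mathbf{j}}(\mathbf{x})\Delta_{\mathbf{i}}(\mathbf{x})|
&\le C (\mathrm{E}|Z_{\mathbf{i}} K_c((\mathbf{x} - \mathbf{X}_{\mathbf{i}})/b_n)|^{2+\delta})^{2/(2+\delta)} \{\varphi(\mathbf{j} - \mathbf{i})\}^{\delta/(2+\delta)} \\
&\le C b_n^{2d/(2+\delta)} (b_n^{-d}\mathrm{E}|Z_{\mathbf{i}} K_c((\mathbf{x} - \mathbf{X}_{\mathbf{i}})/b_n)|^{2+\delta})^{2/(2+\delta)} \{\varphi(\mathbf{j} - \mathbf{i})\}^{\delta/(2+\delta)} \\
&\le C b_n^{2d/(2+\delta)} \{\varphi(\mathbf{j} - \mathbf{i})\}^{\delta/(2+\delta)}.
\end{aligned}$$

Hence,

(5.9) $$|\tilde{J}_2| \le C b_n^{2d/(2+\delta)} \sum\sum_{\mathbf{i}, \mathbf{j} \in \mathcal{S}_2} \{\varphi(\mathbf{j} - \mathbf{i})\}^{\delta/(2+\delta)} := C b_n^{2d/(2+\delta)}\, \Sigma_2, \quad \text{say}.$$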

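We now analyze the quantity $\Sigma_2$ in detail. For any $N$-tuple $\mathbf{0} \ne \boldsymbol{\ell} = (\ell_1, \dots, \ell_N) \in \{0, 1\}^N$, set

$$\mathcal{S}(\ell_1, \dots, \ell_N) := \{\mathbf{i}, \mathbf{j} \in \mathcal{I}_{\mathbf{n}} : |j_k - i_k| > c_{nk} \text{ if } \ell_k = 1 \text{ and } |j_k - i_k| \le c_{nk} \text{ if } \ell_k = 0,\ k = 1, \dots, N\}$$

and

$$V(\ell_1, \dots, \ell_N) := \sum\sum_{\mathbf{i}, \mathbf{j} \in \mathcal{S}(\ell_1, \dots, \ell_N)} \{\varphi(\mathbf{j} - \mathbf{i})\}^{\delta/(2+\delta)}.$$

Then

(5.10) $$\Sigma_2 = \sum\sum_{\mathbf{i}, \mathbf{j} \in \mathcal{S}_2} \{\varphi(\mathbf{j} - \mathbf{i})\}^{\delta/(2+\delta)} = \sum_{\mathbf{0} \ne \boldsymbol{\ell} \in \{0,1\}^N} V(\ell_1, \dots, \ell_N).$$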

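Without loss of generality, consider $V(1, 0, \dots, 0)$. Because $\sum_{|i_k - j_k| > c_{nk}} (\cdots)$ decomposes into $\sum_{i_k = 1}^{n_k - c_{nk} - 1} \sum_{j_k = i_k + c_{nk} + 1}^{n_k} (\cdots) + \sum_{j_k = 1}^{n_k - c_{nk} - 1} \sum_{i_k = j_k + c_{nk} + 1}^{n_k} (\cdots)$, and $\sum_{|i_k - j_k| \le c_{nk}} (\cdots)$ into $\sum_{i_k = 1}^{n_k} \sum_{j_k = i_k + 1}^{i_k + c_{nk}} (\cdots) + \sum_{j_k = 1}^{n_k} \sum_{i_k = j_k + 1}^{j_k + c_{nk}} (\cdots)$, we have

$$\begin{aligned}
V(1, 0, \dots, 0)
&= \sum_{|i_1 - j_1| > c_{n1}} \; \sum_{|i_2 - j_2| \le c_{n2}} \cdots \sum_{|i_N - j_N| \le c_{nN}} \{\varphi(\mathbf{j} - \mathbf{i})\}^{\delta/(2+\delta)} \\
&\le \hat{\mathbf{n}} \Biggl\{ \sum_{j_1 = c_{n1}}^{n_1} + \sum_{-j_1 = c_{n1}}^{n_1} \Biggr\} \Biggl\{ \sum_{j_2 = 1}^{c_{n2}} + \sum_{-j_2 = 1}^{c_{n2}} \Biggr\} \cdots \Biggl\{ \sum_{j_N = 1}^{c_{nN}} + \sum_{-j_N = 1}^{c_{nN}} \Biggr\} \{\varphi(j_1, \dots, j_N)\}^{\delta/(2+\delta)} \\
&= \hat{\mathbf{n}} \sum_{|j_1| = c_{n1}}^{n_1} \; \sum_{|j_2| = 1}^{c_{n2}} \cdots \sum_{|j_N| = 1}^{c_{nN}} \{\varphi(j_1, \dots, j_N)\}^{\delta/(2+\delta)} \\
&\le \hat{\mathbf{n}} \sum_{|j_1| = c_{n1}}^{n_1} \; \sum_{|j_2| = 1}^{n_2} \cdots \sum_{|j_N| = 1}^{n_N} \{\varphi(j_1, \dots, j_N)\}^{\delta/(2+\delta)}.
\end{aligned}$$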

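More generally,

(5.11) $$V(\ell_1, \ell_2, \dots, \ell_N) \le \hat{\mathbf{n}} \sum_{|j_1|} \cdots \sum_{|j_k|} \cdots \sum_{|j_N|} \{\varphi(j_1, \dots, j_N)\}^{\delta/(2+\delta)},$$

where the sums $\sum_{|j_k|}$ run over all values of $j_k$ such that $1 \le |j_k| \le n_k$ if $\ell_k = 0$, and such that $c_{nk} \le |j_k| \le n_k$ if $\ell_k = 1$. Since the summands are nonnegative, for $1 \le c_{nk} \le n_k$, we have $\sum_{|j_k| = c_{nk}}^{n_k} (\cdots) \le \sum_{|j_k| = 1}^{n_k} (\cdots)$, and (5.9)-(5.11) imply

(5.12) $$|\tilde{J}_2| \le C b_n^{2d/(2+\delta)}\, \hat{\mathbf{n}} \sum_{k=1}^{N} \Biggl( \sum_{|j_1| = 1}^{n_1} \cdots \sum_{|j_{k-1}| = 1}^{n_{k-1}} \; \sum_{|j_k| = c_{nk}}^{n_k} \; \sum_{|j_{k+1}| = 1}^{n_{k+1}} \cdots \sum_{|j_N| = 1}^{n_N} \{\varphi(j_1, \dots, j_N)\}^{\delta/(2+\delta)} \Biggr).$$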

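Thus, (5.2) is a consequence of (5.7) and (5.12). If, furthermore, $\varphi(j_1, \dots, j_N)$ depends on $\|\mathbf{j}\|$ only, then

$$\begin{aligned}
&\sum_{|j_1| = 1}^{n_1} \cdots \sum_{|j_{k-1}| = 1}^{n_{k-1}} \; \sum_{|j_k| = c_{nk}}^{n_k} \; \sum_{|j_{k+1}| = 1}^{n_{k+1}} \cdots \sum_{|j_N| = 1}^{n_N} \{\varphi(\|\mathbf{j}\|)\}^{\delta/(2+\delta)} \\
&\quad \le \sum_{|j_1| = 1}^{n_1} \cdots \sum_{|j_{k-1}| = 1}^{n_{k-1}} \; \sum_{|j_k| = c_{nk}}^{n_k} \; \sum_{|j_{k+1}| = 1}^{n_{k+1}} \cdots \sum_{|j_{N-1}| = 1}^{n_{N-1}} \; \sum_{t^2 = j_1^2 + \cdots + j_{N-1}^2 + 1}^{j_1^2 + \cdots + j_{N-1}^2 + n_N^2} \{\varphi(t)\}^{\delta/(2+\delta)} \\
&\quad \le \sum_{t = c_{nk}}^{\|\mathbf{n}\|} \; \sum_{|j_1| = 1}^{t} \cdots \sum_{|j_{N-1}| = 1}^{t} \{\varphi(t)\}^{\delta/(2+\delta)} \le C \sum_{t = c_{nk}}^{\|\mathbf{n}\|} t^{N-1} \{\varphi(t)\}^{\delta/(2+\delta)};
\end{aligned}$$

(5.3) follows. □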

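Proof of Lemma 2.2. Observe that

(5.13) $$\begin{aligned}
\tilde{I}(\mathbf{x}) = b_n^{-d}\, \mathrm{E}\Delta_{\mathbf{j}}^2(\mathbf{x}) &= b_n^{-d} [\mathrm{E}\eta_{\mathbf{j}}^2 - (\mathrm{E}\eta_{\mathbf{j}})^2] \\
&= b_n^{-d} [\mathrm{E} Z_{\mathbf{j}}^2 K_c^2((\mathbf{x} - \mathbf{X}_{\mathbf{j}})/b_n) - \{\mathrm{E} Z_{\mathbf{j}} K_c((\mathbf{x} - \mathbf{X}_{\mathbf{j}})/b_n)\}^2].
\end{aligned}$$

Under assumption (A5), by the Lebesgue density theorem,

$$\lim_{n \to \infty} \int b_n^{-d}\, \mathrm{E}[Z_{\mathbf{j}}^2 \mid \mathbf{X}_{\mathbf{j}} = \mathbf{u}]\, K_c^2((\mathbf{x} - \mathbf{u})/b_n) f(\mathbf{u})\, d\mathbf{u} = g^{(2)}(\mathbf{x}) f(\mathbf{x}) \int K_c^2(\mathbf{u})\, d\mathbf{u},$$

$$\lim_{n \to \infty} \int b_n^{-d}\, \mathrm{E}[Z_{\mathbf{j}} \mid \mathbf{X}_{\mathbf{j}} = \mathbf{u}]\, K_c((\mathbf{x} - \mathbf{u})/b_n) f(\mathbf{u})\, d\mathbf{u} = g^{(1)}(\mathbf{x}) f(\mathbf{x}) \int K_c(\mathbf{u})\, d\mathbf{u},$$

where $g^{(i)}(\mathbf{x}) := \mathrm{E}[Z_{\mathbf{j}}^i \mid \mathbf{X}_{\mathbf{j}} = \mathbf{x}]$ for $i = 1, 2$. It is easily seen that $b_n^{-d} \{\mathrm{E} Z_{\mathbf{j}} K_c((\mathbf{x} - \mathbf{X}_{\mathbf{j}})/b_n)\}^2 \to 0$. Thus, from (5.13),

(5.14) $$\lim_{n \to \infty} \tilde{I}(\mathbf{x}) = g^{(2)}(\mathbf{x}) f(\mathbf{x}) \int K_c^2(\mathbf{u})\, d\mathbf{u},$$

where $g^{(2)}(\mathbf{x}) = \mathrm{E}\{Z_{\mathbf{j}}^2 \mid \mathbf{X}_{\mathbf{j}} = \mathbf{x}\} = \mathrm{E}\{(Y_{\mathbf{j}} - g(\mathbf{x}))^2 \mid \mathbf{X}_{\mathbf{j}} = \mathbf{x}\} = \mathrm{Var}\{Y_{\mathbf{j}} \mid \mathbf{X}_{\mathbf{j}} = \mathbf{x}\}$.

Let $c_{nk} := b_n^{-\delta d/((2+\delta)a)} \to \infty$. Clearly, $c_{nk} < n_k$ because $n_k b_n^{\delta d/((2+\delta)a)} > 1$ for all $k$. Apply Lemma 5.2. Since $a > (4 + \delta)N/(2 + \delta)$, we have $N/((2 + \delta)a) < 1/(4 + \delta)$, so that

$$(\hat{\mathbf{n}} b_n^d)^{-1} J_1 \le C b_n^{\delta d/(4+\delta)} c_{n1} \cdots c_{nN} = C b_n^{\delta d/(4+\delta)}\, b_n^{-\delta d N/((2+\delta)a)} \to 0.$$

Moreover, because $c_{nk} \to \infty$, (5.3) and assumption (A4) imply that

(5.15) $$(\hat{\mathbf{n}} b_n^d)^{-1} J_2 \le C \sum_{k=1}^{N} \Biggl( c_{nk}^{a} \sum_{t = c_{nk}}^{\infty} t^{N-1} \{\varphi(t)\}^{\delta/(2+\delta)} \Biggr) \to 0,$$

hence, by (5.2), that

(5.16) $$|\tilde{R}(\mathbf{x})| = (\hat{\mathbf{n}} b_n^d)^{-1} |\tilde{J}_1 + \tilde{J}_2| \le C (\hat{\mathbf{n}} b_n^d)^{-1} (J_1 + J_2) \to 0.$$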

Finally, (2.7) follows from (5.14) and (5.16), which completes the proof of 
Lemma 2.2. □ 
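The Lebesgue density step used in this proof can be illustrated numerically in dimension $d = 1$. The sketch below is an illustration under stated assumptions rather than part of the argument: it takes $f$ to be the standard normal density and replaces the kernel by an Epanechnikov kernel $K$ (both are assumptions chosen for convenience), and the name `smoothed_density` is hypothetical. It evaluates $b^{-1} \int K^2((x - u)/b) f(u)\, du$, which should approach $f(x) \int K^2(v)\, dv = 0.6\, f(x)$ as $b \to 0$.

```python
import numpy as np

def smoothed_density(x, b, num=20001):
    """Riemann-sum approximation of b^{-1} * int K^2((x - u) / b) f(u) du.

    Uses the substitution u = x - b * v, so the integral runs over v in [-1, 1].
    K is the Epanechnikov kernel and f the standard normal density
    (both assumed here purely for illustration).
    """
    v = np.linspace(-1.0, 1.0, num)
    k_squared = (0.75 * (1.0 - v**2)) ** 2
    f = np.exp(-0.5 * (x - b * v) ** 2) / np.sqrt(2.0 * np.pi)
    dv = v[1] - v[0]
    return float(np.sum(k_squared * f) * dv)

def limit_value(x):
    """f(x) * int K^2(v) dv, where int K^2 = 3/5 for the Epanechnikov kernel."""
    return float(0.6 * np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi))
```

Comparing `smoothed_density(x, b)` with `limit_value(x)` for decreasing `b` shows the gap shrinking, which is the behaviour the proof relies on.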

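Proof of Lemma 2.3. From (2.5) and the definition of $Z_{\mathbf{j}}$ [recall that $a_0 = g(\mathbf{x})$, $\mathbf{a}_1 = g'(\mathbf{x})$],

$$\begin{aligned}
\mathrm{E}[A_n] &= (\hat{\mathbf{n}} b_n^d)^{1/2}\, b_n^{-d}\, \mathrm{E}\Bigl[ Z_{\mathbf{j}} K_c\Bigl( \frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_n} \Bigr) \Bigr] \\
&= (\hat{\mathbf{n}} b_n^d)^{1/2}\, b_n^{-d}\, \mathrm{E}\Bigl[ (Y_{\mathbf{j}} - a_0 - \mathbf{a}_1^{\tau}(\mathbf{X}_{\mathbf{j}} - \mathbf{x})) K_c\Bigl( \frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_n} \Bigr) \Bigr] \\
&= (\hat{\mathbf{n}} b_n^d)^{1/2}\, b_n^{-d}\, \mathrm{E}\Bigl[ (g(\mathbf{X}_{\mathbf{j}}) - a_0 - \mathbf{a}_1^{\tau}(\mathbf{X}_{\mathbf{j}} - \mathbf{x})) K_c\Bigl( \frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_n} \Bigr) \Bigr] \\
&= (\hat{\mathbf{n}} b_n^d)^{1/2}\, b_n^{-d}\, \mathrm{E}\Bigl[ \tfrac{1}{2} (\mathbf{X}_{\mathbf{j}} - \mathbf{x})^{\tau} g''(\mathbf{x} + \xi(\mathbf{X}_{\mathbf{j}} - \mathbf{x})) (\mathbf{X}_{\mathbf{j}} - \mathbf{x}) K_c\Bigl( \frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_n} \Bigr) \Bigr] \quad (\text{where } |\xi| < 1) \\
&= (\hat{\mathbf{n}} b_n^d)^{1/2}\, b_n^{2}\, b_n^{-d}\, \mathrm{E}\Bigl[ \tfrac{1}{2} \Bigl( \frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_n} \Bigr)^{\tau} g''(\mathbf{x} + \xi(\mathbf{X}_{\mathbf{j}} - \mathbf{x})) \Bigl( \frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_n} \Bigr) K_c\Bigl( \frac{\mathbf{X}_{\mathbf{j}} - \mathbf{x}}{b_n} \Bigr) \Bigr];
\end{aligned}$$

the lemma follows via assumption (A3). □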

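Proof of Lemma 3.1. The proof consists of two parts and an additional lemma (Lemma 5.3). Recalling that

(5.17) $$\eta_{\mathbf{j}}(\mathbf{x}) := Z_{\mathbf{j}} K_c(\mathbf{x} - \mathbf{X}_{\mathbf{j}}) \quad \text{and} \quad \Delta_{\mathbf{j}}(\mathbf{x}) := \eta_{\mathbf{j}}(\mathbf{x}) - \mathrm{E}\eta_{\mathbf{j}}(\mathbf{x}),$$

define $\zeta_{\mathbf{n}\mathbf{j}} := b_n^{-d/2} \Delta_{\mathbf{j}}$, and let $S_{\mathbf{n}} := \sum_{j_k = 1;\, k = 1, \dots, N}^{n_k} \zeta_{\mathbf{n}\mathbf{j}}$. Then

$$\hat{\mathbf{n}}^{-1/2} S_{\mathbf{n}} = (\hat{\mathbf{n}} b_n^d)^{1/2}\, \mathbf{c}^{\tau} (W_n(\mathbf{x}) - \mathrm{E} W_n(\mathbf{x})) = A_n - \mathrm{E} A_n.$$

Now, let us decompose $\hat{\mathbf{n}}^{-1/2} S_{\mathbf{n}}$ into smaller pieces involving "large" and "small" blocks. More specifically, consider [all sums run over $\mathbf{i} := (i_1, \dots, i_N)$]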

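$$U(1, \mathbf{n}, \mathbf{x}, \mathbf{j}) := \sum_{\substack{i_k = j_k(p_k + q) + 1 \\ k = 1, \dots, N}}^{j_k(p_k + q) + p_k} \zeta_{\mathbf{n}\mathbf{i}}(\mathbf{x}),$$

$$U(2, \mathbf{n}, \mathbf{x}, \mathbf{j}) := \sum_{\substack{i_k = j_k(p_k + q) + 1 \\ k = 1, \dots, N-1}}^{j_k(p_k + q) + p_k} \; \sum_{i_N = j_N(p_N + q) + p_N + 1}^{(j_N + 1)(p_N + q)} \zeta_{\mathbf{n}\mathbf{i}}(\mathbf{x}),$$

$$U(3, \mathbf{n}, \mathbf{x}, \mathbf{j}) := \sum_{\substack{i_k = j_k(p_k + q) + 1 \\ k = 1, \dots, N-2}}^{j_k(p_k + q) + p_k} \; \sum_{i_{N-1} = j_{N-1}(p_{N-1} + q) + p_{N-1} + 1}^{(j_{N-1} + 1)(p_{N-1} + q)} \; \sum_{i_N = j_N(p_N + q) + 1}^{j_N(p_N + q) + p_N} \zeta_{\mathbf{n}\mathbf{i}}(\mathbf{x}),$$

$$U(4, \mathbf{n}, \mathbf{x}, \mathbf{j}) := \sum_{\substack{i_k = j_k(p_k + q) + 1 \\ k = 1, \dots, N-2}}^{j_k(p_k + q) + p_k} \; \sum_{i_{N-1} = j_{N-1}(p_{N-1} + q) + p_{N-1} + 1}^{(j_{N-1} + 1)(p_{N-1} + q)} \; \sum_{i_N = j_N(p_N + q) + p_N + 1}^{(j_N + 1)(p_N + q)} \zeta_{\mathbf{n}\mathbf{i}}(\mathbf{x}),$$

and so on. Note that

$$U(2^N - 1, \mathbf{n}, \mathbf{x}, \mathbf{j}) := \sum_{\substack{i_k = j_k(p_k + q) + p_k + 1 \\ k = 1, \dots, N-1}}^{(j_k + 1)(p_k + q)} \; \sum_{i_N = j_N(p_N + q) + 1}^{j_N(p_N + q) + p_N} \zeta_{\mathbf{n}\mathbf{i}}(\mathbf{x})$$

and

$$U(2^N, \mathbf{n}, \mathbf{x}, \mathbf{j}) := \sum_{\substack{i_k = j_k(p_k + q) + p_k + 1 \\ k = 1, \dots, N}}^{(j_k + 1)(p_k + q)} \zeta_{\mathbf{n}\mathbf{i}}(\mathbf{x}).$$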

Without loss of generality, assume that, for some integers $r_1, \dots, r_N$, $\mathbf{n} = (n_1, \dots, n_N)$ is such that $n_1 = r_1(p_1 + q), \dots, n_N = r_N(p_N + q)$, with $r_k \to \infty$ for all $k = 1, \dots, N$. For each integer $1 \le i \le 2^N$, define

$$T(\mathbf{n}, \mathbf{x}, i) := \sum_{\substack{j_k = 0 \\ k = 1, \dots, N}}^{r_k - 1} U(i, \mathbf{n}, \mathbf{x}, \mathbf{j}).$$

Clearly, $S_{\mathbf{n}} = \sum_{i=1}^{2^N} T(\mathbf{n}, \mathbf{x}, i)$. Note that $T(\mathbf{n}, \mathbf{x}, 1)$ is the sum of the random variables $\zeta_{\mathbf{n}\mathbf{i}}$ over "large" blocks, whereas $T(\mathbf{n}, \mathbf{x}, i)$, $2 \le i \le 2^N$, are sums over "small" blocks. If it is not the case that $n_1 = r_1(p_1 + q), \dots, n_N = r_N(p_N + q)$ for some integers $r_1, \dots, r_N$, then an additional term $T(\mathbf{n}, \mathbf{x}, 2^N + 1)$, say, containing all the $\zeta_{\mathbf{n}\mathbf{j}}$'s that are not included in the big or small blocks, can be considered. This term will not change the proof much. The general approach consists in showing that, as $\mathbf{n} \to \infty$,

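(5.18) $$Q_1 := \Biggl| \mathrm{E}[\exp[iu T(\mathbf{n}, \mathbf{x}, 1)]] - \prod_{\substack{j_k = 0 \\ k = 1, \dots, N}}^{r_k - 1} \mathrm{E}[\exp[iu U(1, \mathbf{n}, \mathbf{x}, \mathbf{j})]] \Biggr| \to 0,$$

(5.19) $$Q_2 := \hat{\mathbf{n}}^{-1}\, \mathrm{E}\Biggl[ \Biggl( \sum_{i=2}^{2^N} T(\mathbf{n}, \mathbf{x}, i) \Biggr)^2 \Biggr] \to 0,$$

(5.20) $$Q_3 := \hat{\mathbf{n}}^{-1} \sum_{\substack{j_k = 0 \\ k = 1, \dots, N}}^{r_k - 1} \mathrm{E}[U(1, \mathbf{n}, \mathbf{x}, \mathbf{j})]^2 \to \sigma^2,$$

(5.21) $$Q_4 := \hat{\mathbf{n}}^{-1} \sum_{\substack{j_k = 0 \\ k = 1, \dots, N}}^{r_k - 1} \mathrm{E}[(U(1, \mathbf{n}, \mathbf{x}, \mathbf{j}))^2 I\{|U(1, \mathbf{n}, \mathbf{x}, \mathbf{j})| > \varepsilon \sigma \hat{\mathbf{n}}^{1/2}\}] \to 0,$$

for every $\varepsilon > 0$. Note that

$$\hat{\mathbf{n}}^{-1/2} S_{\mathbf{n}} / \sigma = T(\mathbf{n}, \mathbf{x}, 1)/(\sigma \hat{\mathbf{n}}^{1/2}) + \sum_{i=2}^{2^N} T(\mathbf{n}, \mathbf{x}, i)/(\sigma \hat{\mathbf{n}}^{1/2}).$$

The term $\sum_{i=2}^{2^N} T(\mathbf{n}, \mathbf{x}, i)/(\sigma \hat{\mathbf{n}}^{1/2})$ is asymptotically negligible by (5.19). The random variables $U(1, \mathbf{n}, \mathbf{x}, \mathbf{j})$ are asymptotically mutually independent by (5.18). The asymptotic normality of $T(\mathbf{n}, \mathbf{x}, 1)/(\sigma \hat{\mathbf{n}}^{1/2})$ follows from (5.20) and the Lindeberg-Feller condition (5.21). The lemma thus follows if we can prove (5.18)-(5.21). This proof is given here. The arguments are reminiscent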
of those used by Masry (1986) and Nakhapetyan (1987). 
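The large-block/small-block decomposition described above can be made concrete with a small enumeration. The sketch below is illustrative only; the function name `block_types` is hypothetical. Each site of the rectangle is classified, direction by direction, according to whether its coordinate falls in the first $p_k$ positions of its period of length $p_k + q$ ("large" side) or in the last $q$ positions ("small" side). Sites that are "large" in every direction are those entering $T(\mathbf{n}, \mathbf{x}, 1)$; the remaining $2^N - 1$ patterns correspond to the small blocks.

```python
from collections import Counter
from itertools import product

def block_types(n, p, q):
    """Count the sites of {1..n_1} x ... x {1..n_N} by block pattern.

    n, p: tuples of length N with n_k = r_k * (p_k + q); q: common gap width.
    Returns a Counter keyed by tuples of booleans, where True in position k
    means the k-th coordinate lies on the "large" side of its period.
    """
    counts = Counter()
    for site in product(*(range(1, nk + 1) for nk in n)):
        pattern = tuple((ik - 1) % (pk + q) < pk for ik, pk in zip(site, p))
        counts[pattern] += 1
    return counts
```

For example, with `n = (12, 15)`, `p = (3, 4)` and `q = 1` (so that $r_1 = r_2 = 3$), the four patterns partition the 180 sites, and the all-large pattern contains $(r_1 p_1)(r_2 p_2) = 108$ of them. Since $q$ is chosen small relative to the $p_k$, the large blocks carry almost all of the sites.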

Before turning to the end of the proof of Lemma 3.1, we establish the 
following preliminary lemma, which significantly reinforces Lemma 3.1 in 
Tran (1990). 

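Lemma 5.3. Let the spatial process $\{Y_{\mathbf{i}}, \mathbf{X}_{\mathbf{i}}\}$ satisfy the mixing property (2.1), and denote by $U_j$, $j = 1, \dots, M$, an $M$-tuple of measurable functions such that $U_j$ is measurable with respect to $\{(Y_{\mathbf{i}}, \mathbf{X}_{\mathbf{i}}), \mathbf{i} \in \tilde{\mathcal{I}}_j\}$, where $\tilde{\mathcal{I}}_j \subset \mathcal{I}_{\mathbf{n}}$. If $\mathrm{Card}(\tilde{\mathcal{I}}_j) \le p$ and $d(\tilde{\mathcal{I}}_\ell, \tilde{\mathcal{I}}_j) \ge q$ for any $\ell \ne j$, then

$$\Biggl| \mathrm{E}\Biggl[ \exp\Biggl\{ iu \sum_{j=1}^{M} U_j \Biggr\} \Biggr] - \prod_{j=1}^{M} \mathrm{E}[\exp\{iu U_j\}] \Biggr| \le C \sum_{j=1}^{M-1} \chi(p, (M - j)p)\, \varphi(q),$$

where $i = \sqrt{-1}$.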



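Proof. Let $a_j := \exp\{iu U_j\}$. Then

$$\begin{aligned}
\mathrm{E}[a_1 \cdots a_M] - \mathrm{E}[a_1] \cdots \mathrm{E}[a_M]
&= \mathrm{E}[a_1 \cdots a_M] - \mathrm{E}[a_1]\mathrm{E}[a_2 \cdots a_M] \\
&\quad + \mathrm{E}[a_1] \{\mathrm{E}[a_2 \cdots a_M] - \mathrm{E}[a_2]\mathrm{E}[a_3 \cdots a_M]\} \\
&\quad + \cdots + \mathrm{E}[a_1]\mathrm{E}[a_2] \cdots \mathrm{E}[a_{M-2}] \{\mathrm{E}[a_{M-1} a_M] - \mathrm{E}[a_{M-1}]\mathrm{E}[a_M]\}.
\end{aligned}$$

Since $|\mathrm{E}[a_j]| \le 1$ for every $j$,

$$\begin{aligned}
|\mathrm{E}[a_1 \cdots a_M] - \mathrm{E}[a_1] \cdots \mathrm{E}[a_M]|
&\le |\mathrm{E}[a_1 \cdots a_M] - \mathrm{E}[a_1]\mathrm{E}[a_2 \cdots a_M]| \\
&\quad + |\mathrm{E}[a_2 \cdots a_M] - \mathrm{E}[a_2]\mathrm{E}[a_3 \cdots a_M]| \\
&\quad + \cdots + |\mathrm{E}[a_{M-1} a_M] - \mathrm{E}[a_{M-1}]\mathrm{E}[a_M]|.
\end{aligned}$$

Note that $d(\tilde{\mathcal{I}}_\ell, \tilde{\mathcal{I}}_j) \ge q$ for any $\ell \ne j$. The lemma then follows by applying Lemma 5.1(ii) to each term on the right-hand side. □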
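The telescoping identity at the heart of this proof is purely algebraic, and it can be checked numerically with empirical expectations in place of true ones. The sketch below is an illustration only; the function name `telescoping_check` and the use of sample means are assumptions made for the example. Given a sample matrix whose columns play the role of $U_1, \dots, U_M$, it computes the left-hand side, the telescoped sum, and the sum of absolute differences that bounds it.

```python
import numpy as np

def telescoping_check(U, u=1.0):
    """Check E[a_1...a_M] - prod_j E[a_j] against its telescoped form.

    U: array of shape (samples, M); a_j = exp(i * u * U_j); expectations are
    replaced by sample means (an assumption made purely for illustration).
    Returns (lhs, telescoped_sum, bound).
    """
    a = np.exp(1j * u * np.asarray(U, dtype=float))
    M = a.shape[1]
    marginal = a.mean(axis=0)                                  # E[a_k]
    joint = [a[:, k:].prod(axis=1).mean() for k in range(M)]   # E[a_k ... a_M]
    lhs = joint[0] - marginal.prod()
    prefix = 1.0 + 0.0j          # running product E[a_1] ... E[a_{k-1}]
    telescoped = 0.0 + 0.0j
    bound = 0.0
    for k in range(M - 1):
        diff = joint[k] - marginal[k] * joint[k + 1]
        telescoped += prefix * diff
        bound += abs(diff)       # uses |E[a_j]| <= 1 to drop the prefix
        prefix *= marginal[k]
    return lhs, telescoped, bound
```

The first two returned values agree up to rounding error, and the modulus of the first never exceeds the third, which is exactly the inequality to which Lemma 5.1(ii) is then applied term by term.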

Proof of Lemma 3.1 (continued). In order to complete the proof of 
Lemma 3.1, we still have to prove (5.18)-(5.21). 

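Proof of (5.18). Ranking the random variables $U(1, \mathbf{n}, \mathbf{x}, \mathbf{j})$ in an arbitrary manner, refer to them as $U_1, \dots, U_M$. Note that $M = \prod_{k=1}^{N} r_k = \hat{\mathbf{n}} \{\prod_{k=1}^{N} (p_k + q)\}^{-1} \le \hat{\mathbf{n}}/p$, where $p = \prod_{k=1}^{N} p_k$. Let

$$\mathcal{I}(1, \mathbf{n}, \mathbf{x}, \mathbf{j}) := \{\mathbf{i} : j_k(p_k + q) + 1 \le i_k \le j_k(p_k + q) + p_k,\ k = 1, \dots, N\}.$$

The distance between two distinct sets $\mathcal{I}(1, \mathbf{n}, \mathbf{x}, \mathbf{j})$ and $\mathcal{I}(1, \mathbf{n}, \mathbf{x}, \mathbf{j}')$ is at least $q$. Clearly, $\mathcal{I}(1, \mathbf{n}, \mathbf{x}, \mathbf{j})$ is the set of sites involved in $U(1, \mathbf{n}, \mathbf{x}, \mathbf{j})$. As for the set of sites $\tilde{\mathcal{I}}_j$ associated with $U_j$, it contains $p$ elements. Hence, in view of Lemma 5.3 and assumption (A4'),

$$Q_1 \le C \sum_{k=1}^{M-1} \min\{p, (M - k)p\}\, \varphi(q) \le C M p\, \varphi(q) \le C \hat{\mathbf{n}}\, \varphi(q),$$

which tends to zero by condition (B2). □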

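Proof of (5.19). In order to prove (5.19), it is enough to show that

$$\hat{\mathbf{n}}^{-1}\, \mathrm{E}[T^2(\mathbf{n}, \mathbf{x}, i)] \to 0 \quad \text{for any } 2 \le i \le 2^N.$$

Without loss of generality, consider $\mathrm{E}[T^2(\mathbf{n}, \mathbf{x}, 2)]$. Ranking the random variables $U(2, \mathbf{n}, \mathbf{x}, \mathbf{j})$ in an arbitrary manner, refer to them as $U_1, \dots, U_M$. We have

$$\mathrm{E}[T^2(\mathbf{n}, \mathbf{x}, 2)] = \sum_{i=1}^{M} \mathrm{Var}(U_i) + 2 \sum_{1 \le i < j \le M} \mathrm{Cov}(U_i, U_j) := V_1 + V_2, \quad \text{say}.$$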
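Since $X_{\mathbf{n}}$ is stationary [recall that $\zeta_{\mathbf{n}\mathbf{j}}(\mathbf{x}) := b_n^{-d/2} \Delta_{\mathbf{j}}(\mathbf{x})$],

$$\mathrm{Var}(U_i) = \sum_{\substack{i_k = 1 \\ k = 1, \dots, N-1}}^{p_k} \; \sum_{i_N = 1}^{q} \mathrm{E}[\zeta_{\mathbf{n}\mathbf{i}}^2(\mathbf{x})] + \sum\sum_{\mathbf{i} \ne \mathbf{j} \in \mathcal{J}} \mathrm{E}[\zeta_{\mathbf{n}\mathbf{j}}(\mathbf{x}) \zeta_{\mathbf{n}\mathbf{i}}(\mathbf{x})] := V_{11} + V_{12},$$

where $\mathcal{J} = \mathcal{J}(\mathbf{p}, q) := \{\mathbf{i}, \mathbf{j} : 1 \le i_k, j_k \le p_k,\ k = 1, \dots, N-1, \text{ and } 1 \le i_N, j_N \le q\}$. From (5.13) and the Lebesgue density theorem [see Chapter 2 of Devroye and Gyorfi (1985)],

$$V_{11} = \Biggl( \prod_{k=1}^{N-1} p_k \Biggr) q\, \mathrm{Var}\{\zeta_{\mathbf{n}\mathbf{i}}(\mathbf{x})\} = \Biggl( \prod_{k=1}^{N-1} p_k \Biggr) q\, \{b_n^{-d}\, \mathrm{E}\Delta_{\mathbf{i}}^2(\mathbf{x})\} \le C \Biggl( \prod_{k=1}^{N-1} p_k \Biggr) q.$$

Thus, applying Lemma 5.2 with $n_k = p_k$, $k = 1, \dots, N-1$, and $n_N = q$ yields

$$V_{12} = b_n^{-d} \sum\sum_{\mathbf{i} \ne \mathbf{j} \in \mathcal{J}} \mathrm{E}[\Delta_{\mathbf{j}}(\mathbf{x}) \Delta_{\mathbf{i}}(\mathbf{x})]$$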



h'^l^'^'M-^PkCr.k] 



qCnN 



.k=l 

/N-l \ N \\n\\ 



\ k = l ) k=lt=Cnk 
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N oo 
A:=li=Cnfc 



It follows that 
Set 

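$$\mathcal{I}(2, \mathbf{n}, \mathbf{x}, \mathbf{j}) := \{\mathbf{i} : j_k(p_k + q) + 1 \le i_k \le j_k(p_k + q) + p_k,\ 1 \le k \le N-1,\ j_N(p_N + q) + p_N + 1 \le i_N \le (j_N + 1)(p_N + q)\}.$$

Then $U(2, \mathbf{n}, \mathbf{x}, \mathbf{j}) = \sum_{\mathbf{i} \in \mathcal{I}(2, \mathbf{n}, \mathbf{x}, \mathbf{j})} \zeta_{\mathbf{n}\mathbf{i}}(\mathbf{x})$. Since $p_k > q$, if $\mathbf{i}$ and $\mathbf{i}'$ belong to two distinct sets $\mathcal{I}(2, \mathbf{n}, \mathbf{x}, \mathbf{j})$ and $\mathcal{I}(2, \mathbf{n}, \mathbf{x}, \mathbf{j}')$, then $\|\mathbf{i} - \mathbf{i}'\| \ge q$. In view of (5.8) and (5.22), we obtain

(5.24) $$\begin{aligned}
|V_2| &\le C \sum\sum_{\{\mathbf{i}, \mathbf{j} :\, \|\mathbf{i} - \mathbf{j}\| \ge q,\ 1 \le i_k, j_k \le n_k\}} |\mathrm{E}[\zeta_{\mathbf{n}\mathbf{i}}(\mathbf{x}) \zeta_{\mathbf{n}\mathbf{j}}(\mathbf{x})]| \\
&\le C b_n^{-d} \sum\sum_{\{\mathbf{i}, \mathbf{j} :\, \|\mathbf{i} - \mathbf{j}\| \ge q,\ 1 \le i_k, j_k \le n_k\}} |\mathrm{E}[\Delta_{\mathbf{i}}(\mathbf{x}) \Delta_{\mathbf{j}}(\mathbf{x})]| \\
&\le C b_n^{-d}\, b_n^{2d/(2+\delta)} \sum\sum_{\{\mathbf{i}, \mathbf{j} :\, \|\mathbf{i} - \mathbf{j}\| \ge q,\ 1 \le i_k, j_k \le n_k\}} \{\varphi(\|\mathbf{j} - \mathbf{i}\|)\}^{\delta/(2+\delta)} \\
&\le C \Biggl( \prod_{k=1}^{N} n_k \Biggr) \Biggl( b_n^{-\delta d/(2+\delta)} \sum_{t=q}^{\infty} t^{N-1} \{\varphi(t)\}^{\delta/(2+\delta)} \Biggr).
\end{aligned}$$

Take $c_{nk} = b_n^{-\delta d/((2+\delta)a)} \to \infty$. Condition (B3) implies that $q b_n^{\delta d/((2+\delta)a)} > 1$, so that $c_{nk} < q < p_k$. Then, as proved in (5.15) and (5.16), it follows from assumption (A4) that $\pi_n \to 0$. Thus, from (5.22), (5.23) and (5.24),

$$\hat{\mathbf{n}}^{-1}\, \mathrm{E}[T^2(\mathbf{n}, \mathbf{x}, 2)] \le C (q/p_N)[1 + \pi_n] + C \Biggl( b_n^{-\delta d/(2+\delta)} \sum_{t=q}^{\infty} t^{N-1} \{\varphi(t)\}^{\delta/(2+\delta)} \Biggr),$$

which tends to zero by $q/p_N \to 0$ and condition (B3); (5.19) follows. □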
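Proof of (5.20). Let $S'_{\mathbf{n}} := T(\mathbf{n}, \mathbf{x}, 1)$ and $S''_{\mathbf{n}} := \sum_{i=2}^{2^N} T(\mathbf{n}, \mathbf{x}, i)$. Then $S'_{\mathbf{n}}$ is a sum of $\zeta_{\mathbf{n}\mathbf{j}}$'s over the "large" blocks, $S''_{\mathbf{n}}$ over the "small" ones. Lemma 3.2 implies $\hat{\mathbf{n}}^{-1}\, \mathrm{E}[|S_{\mathbf{n}}|^2] \to \sigma^2$. This, combined with (5.19), entails $\hat{\mathbf{n}}^{-1}\, \mathrm{E}[|S'_{\mathbf{n}}|^2] \to \sigma^2$. Now,

(5.25) $$\hat{\mathbf{n}}^{-1}\, \mathrm{E}[|S'_{\mathbf{n}}|^2] = \hat{\mathbf{n}}^{-1} \sum_{\substack{j_k = 0 \\ k = 1, \dots, N}}^{r_k - 1} \mathrm{E}[U^2(1, \mathbf{n}, \mathbf{x}, \mathbf{j})] + \hat{\mathbf{n}}^{-1} \sum_{\mathbf{i} \ne \mathbf{j} \in \mathcal{J}^*} \mathrm{Cov}(U(1, \mathbf{n}, \mathbf{x}, \mathbf{j}), U(1, \mathbf{n}, \mathbf{x}, \mathbf{i})),$$

where $\mathcal{J}^* = \mathcal{J}^*(\mathbf{p}, q) := \{\mathbf{i}, \mathbf{j} : 1 \le i_k, j_k \le r_k - 1,\ k = 1, \dots, N\}$. Observe that (5.20) follows from (5.25) if the last sum in the right-hand side of (5.25) tends to zero as $\mathbf{n} \to \infty$. Using the same argument as in the derivation of the bound (5.22) for $V_2$, this sum can be bounded by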



00 



||i||>ij «fe=l \t=q / 

k=l,...,N 

which tends to zero by condition (B3). □ 

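Proof of (5.21). We need a truncation argument because $Z_{\mathbf{i}}$ is not necessarily bounded. Set $Z_{\mathbf{i}}^L := Z_{\mathbf{i}} I_{\{|Z_{\mathbf{i}}| \le L\}}$, $\eta_{\mathbf{i}}^L := Z_{\mathbf{i}}^L K_c((\mathbf{X}_{\mathbf{i}} - \mathbf{x})/b_n)$, $\Delta_{\mathbf{i}}^L := \eta_{\mathbf{i}}^L - \mathrm{E}\eta_{\mathbf{i}}^L$, $\zeta_{\mathbf{n}\mathbf{i}}^L := b_n^{-d/2} \Delta_{\mathbf{i}}^L$, where $L$ is a fixed positive constant, and define $U^L(1, \mathbf{n}, \mathbf{x}, \mathbf{j}) := \sum_{\mathbf{i} \in \mathcal{I}(1, \mathbf{n}, \mathbf{x}, \mathbf{j})} \zeta_{\mathbf{n}\mathbf{i}}^L$. Put

$$Q_4^L := \hat{\mathbf{n}}^{-1} \sum_{\substack{j_k = 0 \\ k = 1, \dots, N}}^{r_k - 1} \mathrm{E}[(U^L(1, \mathbf{n}, \mathbf{x}, \mathbf{j}))^2 I\{|U^L(1, \mathbf{n}, \mathbf{x}, \mathbf{j})| > \varepsilon \sigma \hat{\mathbf{n}}^{1/2}\}].$$

Clearly, $|\zeta_{\mathbf{n}\mathbf{j}}^L| \le C L b_n^{-d/2}$. Therefore $|U^L(1, \mathbf{n}, \mathbf{x}, \mathbf{j})| \le C L p b_n^{-d/2}$. Hence

$$Q_4^L \le C p^2 b_n^{-d}\, \hat{\mathbf{n}}^{-1} \sum_{\substack{j_k = 0 \\ k = 1, \dots, N}}^{r_k - 1} \mathrm{P}[U^L(1, \mathbf{n}, \mathbf{x}, \mathbf{j}) > \varepsilon \sigma \hat{\mathbf{n}}^{1/2}].$$

Now, $U^L(1, \mathbf{n}, \mathbf{x}, \mathbf{j})/(\sigma \hat{\mathbf{n}}^{1/2}) \le C p (\hat{\mathbf{n}} b_n^d)^{-1/2} \to 0$, since $p = [(\hat{\mathbf{n}} b_n^d)^{1/2}/s_n]$, where $s_n \to \infty$. Thus $\mathrm{P}[U^L(1, \mathbf{n}, \mathbf{x}, \mathbf{j}) > \varepsilon \sigma \hat{\mathbf{n}}^{1/2}] = 0$ at all $\mathbf{j}$ for sufficiently large $\hat{\mathbf{n}}$. Thus $Q_4^L = 0$ for large $\hat{\mathbf{n}}$, and (5.21) holds for the truncated variables. Hence

(5.26) $$\hat{\mathbf{n}}^{-1/2} S_{\mathbf{n}}^L := \hat{\mathbf{n}}^{-1/2} \sum_{\substack{j_k = 1 \\ k = 1, \dots, N}}^{n_k} \zeta_{\mathbf{n}\mathbf{j}}^L \xrightarrow{\mathcal{L}} N(0, \sigma_L^2),$$

where $\sigma_L^2 := \mathrm{Var}(Z_{\mathbf{i}}^L \mid \mathbf{X}_{\mathbf{i}} = \mathbf{x}) f(\mathbf{x}) \int K_c^2(\mathbf{u})\, d\mathbf{u}$.

Defining $S_{\mathbf{n}}^{L*} := \sum_{j_k = 1;\, k = 1, \dots, N}^{n_k} (\zeta_{\mathbf{n}\mathbf{j}} - \zeta_{\mathbf{n}\mathbf{j}}^L)$, we have $S_{\mathbf{n}} = S_{\mathbf{n}}^L + S_{\mathbf{n}}^{L*}$. Note that

$$\begin{aligned}
|\mathrm{E}[\exp(iu S_{\mathbf{n}}/\hat{\mathbf{n}}^{1/2})] - \exp(-u^2 \sigma^2/2)|
&\le |\mathrm{E}[\{\exp(iu S_{\mathbf{n}}^L/\hat{\mathbf{n}}^{1/2}) - \exp(-u^2 \sigma_L^2/2)\} \exp(iu S_{\mathbf{n}}^{L*}/\hat{\mathbf{n}}^{1/2})]| \\
&\quad + |\mathrm{E}[\exp(iu S_{\mathbf{n}}^{L*}/\hat{\mathbf{n}}^{1/2}) - 1] \exp(-u^2 \sigma_L^2/2)| \\
&\quad + |\exp(-u^2 \sigma_L^2/2) - \exp(-u^2 \sigma^2/2)| \\
&= E_1 + E_2 + E_3, \quad \text{say}.
\end{aligned}$$

Letting $\mathbf{n} \to \infty$, $E_1$ tends to zero by (5.26) and the dominated convergence theorem. Letting $L$ go to infinity, the dominated convergence theorem also implies that $\sigma_L^2 := \mathrm{Var}(Z_{\mathbf{i}}^L \mid \mathbf{X}_{\mathbf{i}} = \mathbf{x}) f(\mathbf{x}) \int K_c^2(\mathbf{u})\, d\mathbf{u}$ converges to

$$\mathrm{Var}(Z_{\mathbf{i}} \mid \mathbf{X}_{\mathbf{i}} = \mathbf{x}) f(\mathbf{x}) \int K_c^2(\mathbf{u})\, d\mathbf{u} = \mathrm{Var}(Y_{\mathbf{i}} \mid \mathbf{X}_{\mathbf{i}} = \mathbf{x}) f(\mathbf{x}) \int K_c^2(\mathbf{u})\, d\mathbf{u} := \sigma^2$$

and hence that $E_3$ tends to zero. Finally, in order to prove that $E_2$ also tends to zero, it suffices to show that $S_{\mathbf{n}}^{L*}/\hat{\mathbf{n}}^{1/2} \to 0$ in probability as first $\mathbf{n} \to \infty$ and then $L \to \infty$, which in turn would follow if we could show that

$$\mathrm{E}[(S_{\mathbf{n}}^{L*}/\hat{\mathbf{n}}^{1/2})^2] \to \mathrm{Var}(|Z_{\mathbf{i}}| I_{\{|Z_{\mathbf{i}}| > L\}} \mid \mathbf{X}_{\mathbf{i}} = \mathbf{x}) f(\mathbf{x}) \int K_c^2(\mathbf{u})\, d\mathbf{u} \quad \text{as } \mathbf{n} \to \infty.$$

This follows along the same lines as Lemma 3.2. □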

The proof of Lemma 3.1 is thus complete. □ 